(54) Title: GENE EXPRESSION IN BIOLOGICAL CONDITIONS

(57) Abstract: The invention concerns a method of determining the presence or absence of a biological condition in humans, in
particular of colon cancer, and of determining the stage of a condition in human tissue by determining an expression pattern of a
cell sample. Further, the invention relates to a method of determining the presence or absence of a biological condition in human
tissue, and of determining the stage of a biological condition in human tissue, and also for reducing biological abnormalities of a cell
suffering from the biological condition. A method for producing antibodies against an expression product of a cell from the tissue
is also described. The invention also discloses a pharmaceutical composition for the treatment of a biological condition comprising
at least one antibody, and a vaccine for the prophylaxis or treatment of a biological condition. Further, the invention describes the
use of a method for producing an assay for diagnosing a biological condition in human tissue, the use of a peptide or a gene or a
probe for the preparation of a pharmaceutical composition for the treatment of a biological condition in human tissue, and an assay
for determining the presence or absence of a biological condition in human tissue and for determining an expression pattern of a cell.

Gene expression in biological conditions

Technical field of the invention 

The present invention relates to a method of determining the presence or absence of
a biological condition in animal tissue, wherein the expression of genes in normal 
tissue and tissue from the biological condition is examined and correlated to 
standards. The invention further relates to treatment of the biological condition and 
an assay for determining the condition. 


Background 

The building of large databases containing human genome sequences is the basis
for studies of gene expression in various tissues during normal physiological and
pathologic conditions. Constantly (constitutively) expressed sequences as well as
sequences whose expression is altered during disease processes are important for
our understanding of cellular properties, and for the identification of candidate genes
for future therapeutic intervention. As the number of known genes and ESTs builds
up in the databases, array-based simultaneous screening of thousands of genes is
necessary to obtain a profile of transcriptional behaviour, and to identify key genes
that, either alone or in combination with other genes, control various aspects of
cellular life. One cellular behaviour that has been a mystery for many years is the
malignant behaviour of cancer cells. We now know that, for example, defects in DNA
repair can lead to cancer, but the cancer-creating mechanism in heterozygous
individuals is still largely unknown, as is the malignant cell's ability to repeat cell
cycles, to avoid apoptosis, to escape the immune system, to invade and metastasize,
and to escape therapy. There are hints and indications in these areas and excellent
progress has been made, but the myriad of genes interacting with each other in a
highly complex multidimensional network is making the road to insight long and
contorted.

Similar appearing tumors (morphologically, histochemically, microscopically) can
be profoundly different. They can have different invasive and metastasizing
properties, as well as respond differently to therapy. There is thus a need in the art
for methods which distinguish tumors and tissues on different bases than are
currently in use in the clinic.

The malignant transformation from normal tissue to cancer is believed to be a
multistep process, in which tumor suppressor genes, which normally repress cancer
growth, show reduced gene expression and in which other genes that encode tumor
promoting proteins (oncogenes) show an increased expression level. Several tumor
suppressor genes have been identified up till now, e.g. p16, Rb, p53 (Nesrin
Ozoren and Wafik S. El-Deiry, Introduction to cancer genes and growth control. In:
DNA alterations in cancer, genetic and epigenetic changes, Eaton Publishing,
Melanie Ehrlich (ed), p. 1-43, 2000; and references therein).
They are usually identified by their lack of expression or their mutation in cancer
tissue.

Other examinations have shown this downregulation of transcripts to be partly due
to loss of genomic material (loss of heterozygosity), partly to methylation of promoter
regions, and partly due to unknown factors (Nesrin Ozoren and Wafik S. El-Deiry,
Introduction to cancer genes and growth control. In: DNA alterations in cancer,
genetic and epigenetic changes, Eaton Publishing, Melanie Ehrlich (ed), p. 1-43,
2000; and references therein).

Several oncogenes are known, e.g. cyclinD1/PRAD1/BCL1, FGFs, c-MYC, BCL-2,
all of which are genes that are amplified in cancer, showing an increased level of
transcript (Nesrin Ozoren and Wafik S. El-Deiry, Introduction to cancer genes
and growth control. In: DNA alterations in cancer, genetic and epigenetic changes,
Eaton Publishing, Melanie Ehrlich (ed), p. 1-43, 2000; and references therein). Many
of these genes are related to cell growth and direct the tumor cells to uninhibited
growth. Others may be related to tissue degradation as they e.g. encode enzymes
that break down the surrounding connective tissue.


Summary of the invention 

In one aspect the present invention relates to a method of determining the presence
or absence of a biological condition in animal tissue comprising

collecting a sample comprising cells from the tissue and/or expression products
from the cells,

assaying a first expression level of at least one gene from a first gene group,
wherein the gene from the first gene group is selected from genes expressed in
normal tissue cells in an amount higher than expression in biological condition
cells, and/or

assaying a second expression level of at least one gene from a second gene
group, wherein the second gene group is selected from genes expressed in
normal tissue cells in an amount lower than expression in biological condition
cells,

correlating the first expression level to a standard expression level for normal
tissue, and/or the second expression level to a standard expression level for
biological condition cells to determine the presence or absence of a biological
condition in the animal tissue.
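
The correlating step above can be illustrated with a minimal Python sketch. It is not the procedure of the invention: the function name `indicates_condition`, the numeric levels, and the use of a two-fold margin as the decision rule are assumptions made only for illustration.

```python
def indicates_condition(first_level=None, second_level=None,
                        normal_standard=None, condition_standard=None,
                        fold=2.0):
    """Return True if the measured levels point to the biological condition.

    first_level: expression of a first-group gene (higher in normal tissue).
    second_level: expression of a second-group gene (higher in condition cells).
    The standards and the `fold` margin are illustrative assumptions.
    """
    votes = []
    if first_level is not None and normal_standard is not None:
        # A first-group gene at least `fold` times below the normal standard
        # suggests loss of expression relative to normal tissue.
        votes.append(first_level * fold <= normal_standard)
    if second_level is not None and condition_standard is not None:
        # A second-group gene within `fold` of the condition standard
        # suggests up-regulation as seen in condition cells.
        votes.append(second_level * fold >= condition_standard)
    if not votes:
        raise ValueError("at least one expression level and its standard are required")
    return all(votes)
```

Either level may be supplied alone, mirroring the "and/or" wording of the method; when both are supplied, this sketch requires both comparisons to agree.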

Animal tissue may be tissue from any animal, preferably from a mammal, such as a
horse, a cow, a dog, a cat, and more preferably the tissue is human tissue. The
biological condition may be any condition exhibiting gene expression different from
normal tissue. In particular the biological condition relates to a malignant or
premalignant condition, such as a tumor or cancer.

Furthermore, the invention relates to a method of determining the stage of a
biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

assaying the expression of at least a first stage gene from a first stage gene
group and at least a second stage gene from a second stage gene group,
wherein at least one of said genes is expressed in said first stage of the condition
in a higher amount than in said second stage, and the other gene is expressed
in said first stage of the condition in a lower amount than in said second
stage of the condition,

correlating the expression level of the at least two genes to a standard level of
expression determining the stage of the condition.

Thereby, it is possible to determine the biological condition in more detail, such as
determination of a stage and/or a grade of a tumor.
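
As a rough illustration of correlating two stage genes to standard levels, the following Python sketch assigns a sample to the stage whose standard profile is closest on a log2 scale. The nearest-profile rule, the gene names `gene_x` and `gene_y`, and all numeric values are hypothetical and are not taken from the specification.

```python
import math

def nearest_stage(levels, stage_standards):
    """Pick the stage whose standard log2 profile is closest to the sample.

    levels: dict gene -> measured expression level (positive numbers).
    stage_standards: dict stage -> dict gene -> standard level (positive numbers).
    """
    def distance(standard):
        # Squared Euclidean distance between log2 expression profiles.
        return sum((math.log2(levels[g]) - math.log2(standard[g])) ** 2
                   for g in standard)
    return min(stage_standards, key=lambda stage: distance(stage_standards[stage]))

# Hypothetical standards: gene_x is higher in the first stage, gene_y lower.
standards = {
    "Dukes A": {"gene_x": 800, "gene_y": 100},
    "Dukes C": {"gene_x": 100, "gene_y": 800},
}
```

Using one gene that falls and one that rises between the stages, as the method requires, makes the two standard profiles point in opposite directions, which keeps the assignment robust to an overall scaling of the sample.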

The methods above may be used for determining single gene expressions; however,
the invention also relates to a method of determining an expression pattern of a
colon cell sample, comprising:

collecting a sample comprising colon and/or rectum cells and/or expression
products from colon and/or rectum cells,

determining the expression level of two or more genes in the sample, wherein at
least one gene belongs to a first group of genes, said gene from the first gene
group being expressed in a higher amount in normal tissue than in biological
condition cells, and wherein at least one other gene belongs to a second group
of genes, said gene from the second gene group being expressed in a lower
amount in normal tissue than in biological condition cells, and the difference
between the expression level of the first gene group in normal cells and biological
condition cells being at least two-fold, obtaining an expression pattern of the
colon and/or rectum cell sample.
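
The grouping and pattern-forming steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the reference levels are invented, and the two-fold threshold is applied here to both gene groups, whereas the text states it for the first gene group.

```python
def assign_gene_groups(normal, condition, min_fold=2.0):
    """Label genes as 'first' (higher in normal) or 'second' (higher in condition).

    normal, condition: dict gene -> reference expression level.
    Genes whose levels differ by less than `min_fold` are left out.
    """
    groups = {}
    for gene in normal.keys() & condition.keys():
        n, c = normal[gene], condition[gene]
        if n >= min_fold * c:
            groups[gene] = "first"
        elif c >= min_fold * n:
            groups[gene] = "second"
    return groups

def expression_pattern(sample, groups):
    """Collect (group, level) for every grouped gene measured in the sample."""
    pattern = {gene: (groups[gene], sample[gene])
               for gene in groups if gene in sample}
    found = {group for group, _ in pattern.values()}
    if found != {"first", "second"}:
        # The method calls for at least one gene from each group.
        raise ValueError("pattern needs at least one gene from each group")
    return pattern
```

For example, with reference levels `{"a": 400, "b": 50, "c": 100}` for normal cells and `{"a": 100, "b": 300, "c": 120}` for condition cells, gene `a` falls in the first group, gene `b` in the second, and gene `c` is dropped because its change is below two-fold.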

Gene expression patterns may rely on one or a few genes, but more preferred gene
expression patterns rely on expression from multiple genes, whereby combined
information from several genes is obtained.

Further, the invention relates to a method of determining an expression pattern of a
colon cell sample independent of the proportion of submucosal, muscle, or connective
tissue cells present, comprising:

determining the expression of one or more genes in a sample comprising cells,
wherein the one or more genes exclude genes which are expressed in the
submucosal, muscle, or connective tissue, whereby a pattern of expression is
formed for the sample which is independent of the proportion of submucosal,
muscle, or connective tissue cells in the sample.
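
A minimal Python sketch of the exclusion step is given below. It assumes that reference expression levels for submucosal, muscle, or connective tissue are available as a simple mapping; the function name, the `detection_threshold` parameter, and the data are hypothetical.

```python
def tissue_independent_pattern(sample, non_epithelial_expression,
                               detection_threshold=0.0):
    """Drop genes expressed in submucosal, muscle, or connective tissue.

    sample: dict gene -> expression level measured in the sample.
    non_epithelial_expression: dict gene -> level in the tissues to exclude.
    A gene is excluded when its reference level exceeds `detection_threshold`.
    """
    excluded = {gene for gene, level in non_epithelial_expression.items()
                if level > detection_threshold}
    return {gene: level for gene, level in sample.items()
            if gene not in excluded}
```

Because every gene that is detectable in the surrounding tissue is removed before the pattern is formed, the remaining pattern does not shift when a biopsy happens to contain more or less of that tissue.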

The expression pattern may be used in a method according to this information, and
accordingly, the invention also relates to a method of determining the presence or
absence of a biological condition in human colon and/or rectum tissue comprising,

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined above,

correlating the determined expression pattern to a standard pattern,

determining the presence or absence of the biological condition in said tissue.

as well as a method for determining the stage of a biological condition in animal
tissue, comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined above,

correlating the determined expression pattern to a standard pattern,

determining the stage of the biological condition in said tissue.

The invention further relates to a method for reducing cell tumorigenicity of a cell,
said method comprising

contacting a tumor cell with at least one peptide expressed by at least one gene
selected from genes being expressed in an amount two-fold higher in normal cells
than the amount expressed in said tumor cell, or

comprising

obtaining at least one gene selected from genes being expressed in an amount
two-fold higher in normal cells than the amount expressed in said tumor cell,

introducing said at least one gene into the tumor cell in a manner allowing
expression of said gene(s), or

obtaining at least one nucleotide probe capable of hybridising with at least one gene
of a tumor cell, said at least one gene being selected from genes being expressed in
an amount one-fold lower in normal cells than the amount expressed in said tumor
cell, and

introducing said at least one nucleotide probe into the tumor cell in a manner
allowing the probe to hybridise to the at least one gene, thereby inhibiting
expression of said at least one gene.

In a further aspect the invention relates to a method for producing antibodies against
an expression product of a cell from a biological tissue, said method comprising the
steps of

obtaining expression product(s) from at least one gene, said gene being expressed
as defined above,

immunising a mammal with said expression product(s), obtaining antibodies against
the expression product.

The antibodies produced may be used for producing a pharmaceutical composition.
Further, the invention relates to a vaccine capable of eliciting an immune response
against at least one expression product from at least one gene, said gene being
expressed as defined above.

The invention furthermore relates to the use of any of the methods discussed above
for producing an assay for diagnosing a biological condition in animal tissue.

Also, the invention relates to the use of a peptide as defined above as an expression
product and/or the use of a gene as defined above and/or the use of a probe as
defined above for preparation of a pharmaceutical composition for the treatment of a
biological condition in animal tissue.

In a yet further aspect the invention relates to an assay for determining the presence
or absence of a biological condition in animal tissue, comprising

at least one first marker capable of detecting a first expression level of at least
one gene from a first gene group, wherein the gene from the first gene group is
selected from genes expressed in normal tissue cells in an amount higher than
expression in biological condition cells,

at least one second marker capable of detecting a second expression level of at
least one gene from a second gene group, wherein the second gene group is
selected from genes expressed in normal tissue cells in an amount lower than
expression in biological condition cells.

In another aspect the invention relates to an assay for determining an expression
pattern of a colon and/or rectum cell, comprising at least a first marker and a second
marker, wherein the first marker is capable of detecting a gene from a first gene
group as defined above, and the second marker is capable of detecting a gene from
a second gene group as defined above.

Detailed description of the invention

Samples

The samples according to the present invention may be any tissue sample; it is,
however, often preferred to conduct the methods according to the invention on
epithelial tissue, such as epithelial tissue from the gastro-intestinal tract, in particular
from the colon and/or rectum. In particular the epithelial tissue may be mucosa.

The sample may be obtained by any suitable manner known to the man skilled in
the art, such as a biopsy of the tissue, or a superficial sample scraped from the
tissue. The sample may be prepared by forming a cell suspension made from the
tissue, or by obtaining an extract from the tissue.

In one embodiment it is preferred that the sample comprises substantially only cells
from said tissue, such as substantially only cells from mucosa of the colon-rectum.

Biological condition 

The methods according to the invention may be used for determining any biological 
condition, wherein said condition leads to a change in the expression of at least one 
gene, and preferably a change in a variety of genes. 

Thus, the biological condition may be any malignant or premalignant condition, in
particular in the colon/rectum, such as an adenocarcinoma, a carcinoma, a teratoma, a
sarcoma, and/or a lymphoma.

In relation to the gastro-intestinal tract, the biological condition may also be colitis
ulcerosa, Mb. Crohn, diverticulitis, or adenomas.

Single gene expression contra expression pattern 

The expression level may be determined by single gene approaches, i.e. wherein
the determination of expression from one or two or a few genes is conducted. It is
preferred that expression from at least one gene from a first (normal) group is
determined, said first gene group representing genes being expressed at a higher level
in normal tissue, i.e. so-called suppressors, in combination with determination of
expression of at least one gene from a second group, said second group representing
genes being expressed at a higher level in tissue from the biological condition
than in normal tissue, i.e. so-called oncogenes. However, determination of the
expression of a single gene, whether belonging to the first group or second group, is
within the scope of the present invention. In this case it is preferred that the single
gene is selected among genes having a very high change in expression level from
normal cells to biological condition cells.

Another approach is determination of an expression pattern from a variety of genes,
wherein the determination of the biological condition in the tissue relies on
information from a variety of gene expression, i.e. rather on the combination of expressed
genes than on the information from single genes.
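
For the single gene approach, the preference for genes with a very high change in expression can be illustrated by ranking candidate genes by the magnitude of their log2 fold change between normal and condition cells. This Python sketch is only one plausible way to make that selection; the function name and the example values are hypothetical.

```python
import math

def rank_by_fold_change(normal, condition):
    """Order genes by absolute log2 fold change, largest change first.

    normal, condition: dict gene -> reference expression level (positive).
    """
    scored = {gene: math.log2(condition[gene] / normal[gene])
              for gene in normal.keys() & condition.keys()}
    return sorted(scored, key=lambda gene: abs(scored[gene]), reverse=True)
```

Taking the absolute value treats strongly down-regulated first-group genes and strongly up-regulated second-group genes alike, in line with the statement that a single gene from either group may be used.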

Colorectal tumors 

The following data presented herein relate to colorectal tumors [...]
description has focused on the gene expression level as one way of identifying
[...] function in cancer tissue. Genes showing a [...]
(or complete loss) of expression level, measured as the mRNA transcript,
during the malignant progression in colon from normal mucosa through Dukes A
superficial tumors, to Dukes B slightly invasive tumors, to Dukes C that have spread
to lymph nodes, and finally to Dukes D that have metastasized to other organs, have
been examined, as well as genes gaining importance during the differentiation
towards malignancy.

Gene groups

The present invention relates to a variety of genes identified either by an EST
identification number and/or by a gene identification number. Both types of
numbers relate to identification numbers of the UniGene database, NCBI, build 18.

The various genes have been identified using Affymetrix arrays of the following
product numbers:

Human Gene FL array    900183
HU35K SubA             900184
HU35K SubB             900185
HU35K SubC             900186
HU35K SubD             900187



First gene group

The first gene group relates to genes being expressed in normal tissue cells in an
amount higher than expression in biological condition cells. The term "normal tissue
cells" relates to cells from the same type of tissue that is examined with respect to
the biological condition in question. Thus, with respect to colorectal tumors, the
normal tissue relates to colorectal tissue, in particular to colorectal mucosa.

The first gene group therefore relates to genes being downregulated in tumors, such
genes being expected to serve as tumor suppressor genes, and they are of importance
as predictive markers for the disease, as loss of one or more of these may
signal a poor outcome or an aggressive disease course. Furthermore, they may be
important targets for therapy, as restoring their expression level, e.g. by gene therapy,
may suppress the malignant growth.

For a colorectal tissue sample a gene from the first gene group is preferably selected
individually from genes comprising a sequence as identified below by EST

UniGene number      Homologous to

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at      BAC clone AC016778
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at           tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7
RC_T68873_f_at
RC_T40995_f_at
RC_H81070_f_at
RC_N30796_at
RC_W37778_f_at
RC_R70212_s_at
RC_AA426330_at
RC_N33927_s_at
RC_T90190_s_at
RC_AA447145_at
RC_H75860_at
RC_T71132_s_at

and from genes comprising a sequence as identified below

Human chromogranin A mRNA, complete cds    J03915
Human adipsin/complement factor D mRNA, complete cds    M84526
Homo sapiens MLC-1V/Sb isoform gene    M24248
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds    M22324
H.sapiens MT-1l mRNA    X76717
H.sapiens GCAP-II gene    Z70295
Human somatostatin I gene and flanks    J00306
Human YMP mRNA, complete cds    U52101
H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel    X87159
Human K12 protein precursor mRNA, complete cds    U77643
Human sulfate transporter (DTD) mRNA, complete cds    U14528
Human transcription factor hGATA-6 mRNA, complete cds    U66075
H.sapiens SCAD gene, exon 1 and joining features    Z80345
Human S-lac lectin L-14-II (LGALS2) gene    M87860
Human mRNA for protein tyrosine phosphatase    D15049
H.sapiens mRNA for tetranectin    X64559
Human 11kd protein mRNA, complete cds    U28249
Human anti-mullerian hormone type II receptor precursor gene, complete cds    U29700
Human heparin binding protein (HBp17) mRNA, complete cds    M60047
Human ADP-ribosylation factor (hARF6) mRNA, complete cds    M57763
beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein {alternatively spliced, exon 10 to 13 region} [human, Genomic, 1851 nt, segment 3 of 3]    S81083
Zinc Finger Protein Znf155    HG4243-HT4513
Human glucagon mRNA, complete cds    J04040
H.sapiens mRNA for hair keratin, hHb5    X99140
Human tubulin-folding cofactor E mRNA, complete cds    U61232
Human integrin alpha-3 chain mRNA, complete cds    M59911
Human NACP gene    U46901
H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5)    Z47553
Human mRNA for ATF-a transcription factor    X52943
H.sapiens intestinal VIP receptor related protein mRNA    yj7777



and from genes comprising a sequence as identified below

Homo sapiens chromosome 16 BAC clone CIT987SK-815A9 complete sequence    AF001548
Human mRNA for ATP synthase alpha subunit, complete cds    D14710
Human mRNA for IgG Fc binding protein, complete cds    D84239
H.sapiens mRNA for carcinoembryonic antigen, CGM2    X98311
Homo sapiens (clone lamda-hPEC-3) phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds    L05144
Human 11-beta-hydroxysteroid dehydrogenase type 2 mRNA, complete cds    U26726
Human intestinal mucin (MUC2) mRNA, complete cds    L21998
Human mRNA for KIAA0106 gene, complete cds    D14662
metallothionein    V00594
Human mRNA for IgG Fc binding protein, complete cds    D84239
H.sapiens mRNA for carcinoembryonic antigen, CGM2    X98311
Homo sapiens (clone lamda-hPEC-3) phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds    L05144
metallothionein    V00594

In a preferred embodiment a gene from the first gene group is preferably selected
individually from genes comprising a sequence as identified below by EST

UniGene number      Homologous to

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at      BAC clone AC016778

and from genes comprising a sequence as identified below

Human chromogranin A mRNA, complete cds    J03915
Human adipsin/complement factor D mRNA, complete cds    M84526
Homo sapiens MLC-1V/Sb isoform gene    M24248
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds    M22324
H.sapiens MT-1l mRNA    X76717
H.sapiens GCAP-II gene    Z70295
Human somatostatin I gene and flanks    J00306


or selected individually from genes comprising a sequence as identified below by
EST

UniGene number      Homologous to

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at      BAC clone AC016778
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at           tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7


In a more preferred embodiment a gene from the first gene group is selected
individually from genes comprising a sequence as identified below by EST

UniGene number      Homologous to

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15



In a most preferred embodiment a gene from the first gene group is selected
individually from genes comprising a sequence as identified below by EST

UniGene number      Homologous to

RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15



Second gene group 

We have determined genes that are up-regulated (or gained de novo) during the
malignant progression of colorectal cancer from normal tissue through Dukes A, B, C
and to Dukes D. These genes are potential oncogenes and may be those genes that
create or enhance the malignant growth of the cells. The expression level of these
genes may serve as predictive markers for the disease course, as a high level may
signal an aggressive disease course, and they may serve as targets for therapy, as
blocking these genes by e.g. anti-sense therapy, or by biochemical means, could
inhibit, or slow, the tumor growth. Such up-regulated (or gained de novo) genes,
oncogenes, may be classified according to the present invention as genes belonging
to the second gene group.

With respect to colorectal tumors, genes belonging to the second gene group are
preferably selected individually from genes comprising a sequence as identified
below by EST

UniGene number      Homologous to

RC_AA609013_s_at    microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at      CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at      serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at      dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1
RC_AA075642_at      gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at      chrom 13 no homology
RC_N33920_at        ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at        KIAA1199 protein, chrom 15
RC_R67275_s_at      alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at        hypothetical protein; chrom 17
RC_AA443793_at      chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at      chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at      hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at        chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at      no homology
RC_AA417078_at      chrom 7q31; AF017104 clone
M29873_s_at         cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at
RC_T92363_s_at
RC_N89910_at
RC_W60516_at
RC_AA219699_at
RC_AA449450_at

and from genes comprising a sequence as identified below 

Homo sapiens (clones "MDP4/' MDP7) microsomal J05257 

dipeptidase (IVIDP) "mRNA," complete cds 

"Homo sapiens reg gene ""homologue,"" complete L08010 

cds" 

H.sapiens mRNA for prepro-alpha2(l) collagen Z74616 

"Human S-adenosylhomocysteine hydrolase (AHCY) M61832 
""mRNA,"" complete cds" 

Transcription Factor Ilia HG4312- 

HT4582 

Human gene for melanoma growth stimulatory activity X54489 
(MGSA) 

Human stromelysin-3 mRNA X57766 
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CDC25Hu2=cdc25+ homolog "(human," "mRN A." 31 1 8 nt] S781 87 

Human mRNA for cripto protein XI 4253 
Human transformation-sensitive protein (lEF SSP 3521) IUI86752 
"mRNA," complete cds 

Human complement component 2 (C2) gene allele b L09708 

H. sapiens mRNA for ITBA2 protein X92896 

H.sapiens encoding CLA-1 mRNA Z22555 

"Human fibroblast growth factor receptor 4 {FGFR4) L03840 
""mRNA,"" complete cds" 
"-"Fibronectin,"" Alt Splice 1" 



HG3044- 
HT3742 
X54667 
XI 3293 
U24183 



tyk2 

Human mRNA for B-myb gene 

"Human phosphofructokinase (PFKM) ""mRNA,"" complete 
cds" 

Human pre-B cell enhancing factor (PBEF) "mRNA," com- U02020 
plete cds 

Human SH2-containing inositol 5-phosphatase (hSHIP) U57650 
"mRNA," complete cds 

Human interleul<in 8 (iL8) "gene," complete cds 

"Human lamin B receptor (LBR) ""mRNA,"" complete cds 
H.sapiens mRNA for protein tyrosine phosphatase 
Human mRNA for unc-18 "homologue," complete cds 
H.sapiens mRNA for Zn-alpha2-glycoprotein 



"Human asparagine synthetase ""mRNA."" complete cds" 
Human hepatitis delta antigen interacting protein A (dipA) 
"mRNA," complete cds 

Human splicesomal protein (SAP 61) "mRNA," complete 

cds 

Human protein kinase C-binding protein RACK7 "mRNA," 
partial cds 

Human MAC30 "mRNA," 3' end 

Human thrombospondin 2 (THBS2) "mRNA." complete cds 
"Human nicotinamide N-methyltransferase (NNMT) 
-"mRNA,"" complete cds" 

H.sapiens mRNA for type i interstitial collagenase 
Human cytochrome b561 gene 

Human HIS RNA "gene," complete cds (spliced In sili- 
co) 

Human collagen type XVIII alpha 1 (COL18A1) "mRNA," 
partial cds 



M28130 

L25931 
Z48541 
D63851 
X59766 
Z25521 
M27396 
U63825. 

U08815 

U48251 

LI 91 83 
LI 2350 
U08021 

X54925 
U29463 
M32053 

L22548 

U79274 



Human transforming growth factor-t>eta induced gene pro- 
duct (B1GH3) "mRNA," complete cds 


M77349 


"Human breast epithelial antigen BA48 ""mRNA,"" com- 
plete cds" 


U58516 




X57351 


H.sapiens NGAL gene 


X99133 


Human mRNA for MDNCF (monocyte-derived neutrophil
chemotactic factor)


Y00787 


H.sapiens EF-1 delta gene encoding human elongation
factor-1-delta


Z21507




H. sapiens mRNA for prepro-alpha1(I) collagen


Z74615 


Nuclear Factor Nf-ll6 


HG3494- 
HT3688 




U29175


HNL=neutrophil lipocalin [human, ovarian cancer
cell line OC6, mRNA Partial, 534 nt].
/gb=S75256 /ntype=RNA


S76256 



In a preferred embodiment the genes belonging to the second gene group are
preferably selected individually from genes comprising a sequence as identified
below by EST



UniGene number 


Homologous to 


RC_AA007218_at 




chrom 13 no homology 


RC_AA443793_at 




chrom 7p22 AC006028 BAC clone 


RC_AA035482_at




chrom 5; AK022505 clone; CalcineurinB 
(weakly similar) 


RC_H93021_at 




chrom 2 ; XM_004890 peptidylprolyl isome- 
rase A (cyclophilin A) 


RC_AA427737_at




no homology 


RC_AA417078_at




chrom 7q31; AF017104 clone 



and from genes comprising a sequence as identified below

In another preferred embodiment genes from the second gene group are selected 
individually from genes comprising a sequence as identified below 



UniGene number 


Homologous to 


RC_AA609013_s_at 




microsomal dipeptidase (also on 6.8k); 
chrom 16 


RC_AA232508_at 




CGI-89 protein; unnamed protein product; 
hypothetical protein 


RC_AA428964_at 




serine protease-like protease; serine pro- 
tease homolog=NES1 ; normal epithelial cell- 
specific 1 


RC_T52813_s_at 




dJ28O10.2 (G0S2 (PUTATIVE LYMPHO- 
CYTE G0/G1 SWITCH PROTEIN 2; chrom 
1 


RC_AA075642_at 




gp-340 variant protein; DMBT1/8kb.2 protein


RC_AA007218_at




chrom 13 no homology


RC_N33920_at




ubiquitin-like protein FAT10; diubiquitin;
dJ271M21.6 (Diubiquitin); chrom 6 


RC_N71781_at 




KIAA1199 protein, chrom 15 


RC_R67275_s_at 




alpha-1 (type XI) collagen precursor; collagen,
type XI, alpha 1; collagen type XI alpha-1
isoform A; chrom 1








RC_AA443793_at




chrom 7p22 AC006028 BAC clone






ZNF198 protein; zinc finger protein; FIM
protein; Cys-rich protein; zinc finger protein 








RC_AA024482_at 




hypothetical protein; unnamed protein pro- 
duct; chrom 17 


RC_H93021_at




chrom 2 ; XM_004890 peptidylprolyl isome- 
rase A (cyclophilin A) 


RC_AA427737_at




no homology 


RC_AA417078_at 




chrom 7q31; AF017104 clone 


M29873_s_at




cytochrome P450-IIB (hIIB3) ; 19q13.1-
q13.2



In a more preferred embodiment genes from the second gene group are selected 
individually from genes comprising a sequence as identified below



UniGene number Homologous to



RC_AA609013_s_at




microsomal dipeptidase (also on 6.8k); 
chrom 16 


RC_AA232508_at 




CGI-89 protein; unnamed protein product; 
hypothetical protein 


RC_AA428964_at 




serine protease-like protease; serine pro- 
tease homolog=NES1 ; normal epithelial cell- 
specific 1 


RC_AA075642_at




gp-340 variant protein; DMBT1/8kb.2 protein 


RC_AA007218_at 




chrom 13 no homology 


RC_N33920_at 




ubiquitin-like protein FAT10; diubiquitin; 
dJ271M21.6 (Diubiquitin); chrom 6 


RC_N71781_at 




KIAA1199 protein, chrom 15 


RC_R67275_s_at 




alpha-1 (type XI) collagen precursor; colla- 
gen, type XI, alpha 1; collagen type XI alp- 
ha-1 isoform A; chrom 1 


RC_W80763_at 




hypothetical protein; chrom 17 


RC_AA034499_s_at 




ZNF198 protein; zinc finger protein; FIM 
protein; Cys-rich protein; zinc finger protein 
198; chrom 13 


RC_AA035482_at


chrom 5; AK022505 clone; CalcineurinB
(weakly similar)


RC_AA024482_at 




hypothetical protein; unnamed protein product;
chrom 17


RC_H93021_at 




chrom 2 ; XM_004890 peptidylprolyl isome- 
rase A (cyclophilin A) 


RC_AA427737_at




no homology 


RC_AA417078_at 




chrom 7q31; AF017104 clone


M29873_s_at 




cytochrome P450-IIB (hIIB3) ; 19q13.1-
q13.2 



In an even more preferred embodiment genes from the second gene group are
selected individually from genes comprising a sequence as identified below



UniGene number


Homologous to 


RC_AA609013_s_at




microsomal dipeptidase (also on 6.8k); 
chrom 16 


RC_AA007218_at




chrom 13 no homology 


RC_AA035482_at 




chrom 5; AK022505 clone; CalcineurinB 
(weakly similar) 


RC_H93021_at




chrom 2 ; XM_004890 peptidylprolyl isome- 
rase A (cyclophilin A) 


RC_AA427737_at




no homology 


RC_AA417078_at




chrom 7q31; AF017104 clone



such as a sequence as identified below



UniGene number Homologous to

RC_W80763_at hypothetical protein; chrom 17

The genes from the second gene group discussed above are preferably genes
being expressed in all stages of the biological condition, such as all Dukes
stages of a colorectal tumor, to be used for determining the biological condition.

Number of genes

As discussed above, it is possible to use a single gene approach, determining the
expression of only one of the genes in order to determine the biological condition of
the tissue. This is particularly relevant for genes mentioned in the tables in
Experiments, since these genes have been determined as having a strong indicativity
per gene. It is however preferred that expression from at least one gene from the first
group as well as expression from one gene from the second group is determined to
obtain a more statistically significant result that is more independent of the
expression level of the individual gene. In a preferred embodiment expression from
more genes from both groups is determined, such as determination of expression from
at least two genes from either of the gene groups, such as determination of
expression from at least three genes from either of the gene groups, such as
determination of expression from at least four genes from either of the gene groups,
such as determination of expression from at least five genes from either of the gene
groups, such as determination of expression from at least six genes from either of
the gene groups, such as determination of expression from at least seven genes from
either of the gene groups.

A pattern of characteristic expression of one gene can be useful in characterizing a
cell type source or a stage of disease. However, more genes may be usefully
analyzed. Useful patterns include expression of at least one, two, three, five, ten,
fifteen, twenty, twenty-five, fifty, seventy-five, one hundred or several hundred
informative genes.

Expression level

Using the results provided in the accompanying figures and tables, a gene is
indicated as being expressed if an intensity value of greater than or equal to 20 is
shown. Conversely, an intensity value of less than 20 indicates that the gene is not
expressed above background levels. Comparison of an expression pattern to
another may score a change from expressed to non-expressed, or the reverse.
Alternatively, changes in intensity of expression may be scored, either increases or
decreases. Any statistically significant change can be used. Typically changes which
are greater than 2-fold are suitable. Changes which are greater than 5-fold are
highly significant.
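
By way of illustration only, the scoring rules of the preceding paragraph may be sketched as follows. The cut-off of 20 and the 2-fold and 5-fold levels are taken from the text; the function names, the returned labels, and the clamping of sub-background intensities to the cut-off before a ratio is formed are assumptions made for the sketch. The sketch does not test statistical significance.

```python
EXPRESSION_THRESHOLD = 20  # intensity cut-off stated in the text


def is_expressed(intensity, threshold=EXPRESSION_THRESHOLD):
    """True when the intensity is at or above the background cut-off."""
    return intensity >= threshold


def score_change(reference, sample, floor=EXPRESSION_THRESHOLD):
    """Compare one gene's intensity in two expression patterns.

    Assumption (not from the text): intensities below `floor` are raised
    to `floor` before the ratio is taken, so that background noise cannot
    produce an arbitrarily large fold change.
    """
    ref_on, sample_on = is_expressed(reference), is_expressed(sample)
    if ref_on == sample_on:
        call = "unchanged"
    else:
        call = "gained" if sample_on else "lost"

    ratio = max(sample, floor) / max(reference, floor)
    magnitude = ratio if ratio >= 1 else 1 / ratio
    direction = "increase" if ratio > 1 else "decrease" if ratio < 1 else "none"

    if magnitude > 5:
        level = "highly significant"
    elif magnitude > 2:
        level = "suitable"
    else:
        level = "below 2-fold"
    return {"call": call, "direction": direction, "fold": magnitude, "level": level}


if __name__ == "__main__":
    # Hypothetical intensities, for demonstration only.
    print(score_change(10, 150))
    print(score_change(100, 40))
```

The "call" field records a switch between expressed and non-expressed, while "fold" and "level" record the change in intensity, so either way of scoring described above can be read from the same result.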

The present invention in particular relates to methods using genes wherein the ratio
of the expression level in normal tissue to that in biological condition tissue for
suppressor genes, or vice versa the ratio of the expression level in biological
condition tissue to that in normal tissue for condition genes, is as high as possible,
such as at least a two-fold change in expression, such as at least three-fold, such as
at least four-fold, such as at least five-fold, such as at least six-fold, such as at
least ten-fold, such as at least fifteen-fold, such as at least twenty-fold.
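
As a minimal sketch of the ratio described above, the following computes normal over condition for suppressor genes and condition over normal for condition genes, and reports the highest "at least N-fold" level from the list above that the ratio reaches. The function names and the string labels "suppressor" and "condition" are hypothetical and chosen only for the illustration.

```python
# Fold levels enumerated in the text ("at least two-fold" ... "twenty-fold").
FOLD_TIERS = (2, 3, 4, 5, 6, 10, 15, 20)


def expression_ratio(normal, condition, gene_kind):
    """Ratio oriented so that a larger value means a more useful gene.

    gene_kind is "suppressor" (normal / condition) or
    "condition" (condition / normal); both labels are illustrative.
    """
    if normal <= 0 or condition <= 0:
        raise ValueError("expression levels must be positive")
    if gene_kind == "suppressor":
        return normal / condition
    if gene_kind == "condition":
        return condition / normal
    raise ValueError(f"unknown gene kind: {gene_kind!r}")


def highest_tier(ratio):
    """Largest listed fold level that the ratio reaches, or None."""
    reached = [tier for tier in FOLD_TIERS if ratio >= tier]
    return max(reached) if reached else None


if __name__ == "__main__":
    # Hypothetical expression levels, for demonstration only.
    r = expression_ratio(normal=200, condition=50, gene_kind="suppressor")
    print(r, highest_tier(r))
```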

Stages and grades

Stage of a colorectal tumor indicates how deep the tumor has penetrated.
Superficial tumors are termed Dukes A, and Dukes B and Dukes C are used to
describe increasing degrees of penetration into the muscle. The grade of a
colorectal tumor is expressed on a scale of I-IV (1-4). The grade reflects the
cytological appearance of the cells. Grade I cells are almost normal. Grade II cells
are slightly deviant. Grade III cells are clearly abnormal. And Grade IV cells are
highly abnormal.

It is important to classify the stage of a cancer disease, as superficial tumors may
require a less intensive treatment than invasive tumors. We have therefore used the
expression level of genes to identify genes whose expression can be used to identify
a certain stage of the disease. We have divided these "Classifiers" into those
which can be used to identify Dukes A, B, C, and D stages. We expect that measuring
the transcript level of one or more of these genes will lead to a classifier that
can add supplementary information to the information obtained from the pathological
Dukes classification. For example we believe that gene expression levels that signify
a Dukes C will be unfavourable to detect in a Dukes A tumor, as they may signal
that the Dukes A tumor has the potential to become a Dukes C tumor. The opposite
is probably also true, that an expression level that signifies Dukes A will be
favorable to have in a Dukes C tumor. In that way independent information may be
obtained from the pathological Dukes classification and from a classification based
on gene expression levels.

Thus, in one embodiment the invention relates to a method as described above
further comprising the steps of determining the stage of a biological condition in the
animal tissue, comprising assaying a third expression level of at least one gene from
a third gene group, wherein a gene from said second gene group, in one stage, is
expressed differently from a gene from said third gene group.

In another aspect the invention relates to a method of determining the stage of a
biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

assaying the expression of at least a first stage gene from a first stage gene
group and/or at least a second stage gene from a second stage gene group,
wherein at least one of said genes is expressed in said first stage of the
condition in a higher amount than in said second stage, and the other gene is
expressed in said first stage of the condition in a lower amount than in said second
stage of the condition,

correlating the expression level of the assessed genes to a standard level of
expression determining the stage of the condition.

The method of determining the stage of a tumor may be combined with determination
of the biological condition or may be an independent method as such. The
difference in expression level of a gene from one stage to the expression level of the
gene in another group is preferably at least two-fold, such as at least three-fold.
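
The correlating step may, purely as an illustration, be sketched as a comparison of a sample with a standard expression profile for each stage. The text does not prescribe how the correlation is computed; the summed absolute log2 difference used here is an assumption, and the gene names and standard levels are invented for the example.

```python
import math

# Hypothetical standard levels: one gene higher in the first stage,
# one gene lower in the first stage, as the method requires.
EXAMPLE_STANDARDS = {
    "Dukes A": {"gene_up_in_A": 400.0, "gene_down_in_A": 50.0},
    "Dukes B": {"gene_up_in_A": 100.0, "gene_down_in_A": 200.0},
}


def stage_separation(standards, gene, stage_a, stage_b):
    """Fold difference of one gene's standard level between two stages."""
    a, b = standards[stage_a][gene], standards[stage_b][gene]
    return max(a, b) / min(a, b)


def assign_stage(sample, standards):
    """Return the stage whose standard profile lies closest to the sample.

    Distance is the sum of absolute log2 ratios over the genes of the
    standard profile (an assumed measure, not taken from the text).
    """
    def distance(profile):
        return sum(
            abs(math.log2(sample[gene]) - math.log2(level))
            for gene, level in profile.items()
        )

    return min(standards, key=lambda stage: distance(standards[stage]))


if __name__ == "__main__":
    sample = {"gene_up_in_A": 350.0, "gene_down_in_A": 60.0}
    print(assign_stage(sample, EXAMPLE_STANDARDS))
    print(stage_separation(EXAMPLE_STANDARDS, "gene_up_in_A", "Dukes A", "Dukes B"))
```

A gene would be retained as a stage gene in such a scheme only if `stage_separation` is at least two, in line with the preferred two-fold difference stated above.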

Thus, the invention relates to a method of determining the stage of a colorectal
tumor, wherein the stage is selected from colon cancer stages Dukes A, Dukes B,
Dukes C, and Dukes D, comprising assaying the expression of at least one Dukes A
stage gene from a Dukes A stage gene group, at least one Dukes B stage gene
from a Dukes B stage gene group, at least one Dukes C stage gene
from a Dukes C stage gene group, and/or at least one Dukes D stage gene from a
Dukes D stage gene group, wherein at least one gene from each gene group is
expressed in a significantly different amount in that stage than in one of the other
stages.

The genes selected may be a gene from each gene group being expressed in a
significantly higher amount in that stage than in one of the other stages, such as:

a Dukes A stage gene selected individually from any gene comprising a sequence
as identified below as EST


RC_AA599199_at




ALU seq. 


It w n 1 ,<**■ 




unnamed protein product
BAA91641, chrom 10


RC H91325 s at 




aldolase B; aldolase B (aa 1- 
364); chrom 9 


RC_N51709_at




chrom X 


RC_N726iOi3it ^• 






RC_N69263_at




chrom 10; AK026414 clone
(only 108 nt hom)


RC_T15817_f_at 




iNOS, inducible nitric oxide 
synthase 



RC_F03077_f 

RC_AA599199 
RC_AA207015 

RC_AA234916 

RC_N92239_a 

RC_N93958_s 

U95301_at 

RC_AA426330 

RC_AA024658 

RC H88540_a 



chromosome 17, clone 

hRPC.15 

Alu seq 

clone RP4-733M16 on chromosome
1p36.11-36.23
chromosome 19 clone CTC-461H2

Wnt inhibitory factor-1 (WIF-1),
chromosome 12
phospholipase A2, group X 
(PLA2G10), 

phospholipase A2, group X 
(PLA2G10), 

chromosome 17, clone 
hRPC.1110_E_20 
clone SCb-254N2 
(UWGC:rg254N02) from 6p21 
heat shock protein 90, 1q21.2- 
q22 



or any gene comprising a sequence as identified below 



D87444_at 
U18291_at 
L76568_xpt3_f_at 



U45328_s_at 

Z14982_rna1_at 

AD000092_cds7_s 
_at 



D86973_at 
X81636_at 



Human mRNA for KIAA0255 "gene," complete cds 
Human CDC16Hs "mRNA." complete cds 
S26 from Homo sapiens excision and cross link repair protein
(ERCC4) "gene," complete genomic sequence. /gb=L76568 
/ntype=DNA /annot=exon 

"Human ubiquitin-conjugating enzyme (UBE2I) ""mRNA,"" complete 
cds" 

H. sapiens gene for major histocompatibility complex encoded protea- 
some subunit LMP7. 

RAD23A gene (human RAD23A homolog) extracted from Homo

sapiens DNA from chromosome 19p13.2 cosmids "R31240," R30272 

and R28549 containing the "EKLF." "GCDH," "CRTC," and RAD23A 

"genes," genomic sequence 

Human mRNA for KIAA0219 "gene," partial cds 

H.sapiens clathrin light chain a gene

M59916_at
X85781_s_at 

M57731_s_at 

U49188_at 

X53800_s_at 

U56816_at 

HG1067- 



Human acid sphingomyelinase (ASM) "mRNA," complete cds 
"H. sapiens NOS2 ""gene." exon 27 /gb=X85781 /ntype=DNA 
/annot=exon" 

"Human gro-beta ""mRNA,"" complete cds" 

Human placenta (Diff33) "mRNA," complete cds 

Human mRNA for macrophage inflammatory protein-2beta (MIP2beta) 

Human kinase Myt1 (Myt1) "mRNA," complete cds.

Mucin (Gb:M22406) 



Human migration inhibitory factor-related protein 8 (MRP8)
"gene," complete cds


M21005 


Human acyloxyacyl hydrolase "mRNA." complete cds 


M62840 


Human PEP19 (PCP4) "mRNA." complete cds 


U52969 


H.sapiens Humig mRNA


X72755 


H.sapiens PISSLRE mRNA 


X78342 


H.sapiens mRNA for twist "protein," partial. /gb=Y11180
/ntype=RNA


Y11180 


Human mRNA for TGF-beta superfamily "protein," com- 
plete cds 


AB000584 


Human mRNA for °MSS1 complete cds 


D11094 


Human complement factor B "mRNA," complete cds 


L15702


"Homo sapiens GTP-binding protein {RAB2) ""mRNA,"* 
complete cds" 


M28213 


Human translational initiation factor 2 beta subunit (elF-2- 
beta) "mRNA," complete cds 


M29536 


Human E16 "mRNA," complete cds 


M80244 


IEX-1=radiation-inducible immediate-early gene [human,
placenta, mRNA Partial, 1223 nt]


S81914 


Human CDC16Hs "mRNA." complete cds 


U18291 


Human OD96 "mRNA," complete cds 


U21049 


Human (memc) "mRNA," 3'UTR. /gb=U30999 /ntype=RNA


U30999 


"Human ubiquitin-conjugating enzyme (UBE2I) ""mRNA,"" 
complete cds" 


U45328 


"Human fetal brain glycogen phosphorylase B ""mRNA,"" 

complete cds" 


U47025 


"Human BTG2 (BTG2) ""mRNA."" complete cds" 


U72649 


Human jun-B mRNA for JUN-B protein 


X51345 


Human chaperonin 10 "mRNA," complete cds 


U07550 


H.sapiens RING4 cDNA 


X57522 


H.sapiens genes TAP1, TAP2, LMP2, LMP7 and DOB.


X66401 


H.sapiens mRNA for alpha 4 protein


Y08915 


Homo sapiens interleukin-1 receptor-associated kinase
(IRAK) "mRNA," complete cds


L76191 


"Human von Willebrand factor ""mRNA,"" 3' end" 


M10321


Human chromosome segregation gene homolog CAS 
"mRNA," complete cds 


U33286 


Human Bruton's tyrosine kinase-associated protein-135 
"mRNA." complete cds. 


U77948 


"Human KH type splicing regulatory protein KSRP
""mRNA,"" complete cds."


U94832 


H.sapiens ADE2H1 mRNA showing homologies to SAICAR
synthetase and AIR carboxylase of the purine pathway (EC
6.3.2.6, EC 4.1.1.21)


X53793


a Dukes B stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_T67463_s_at 




cathepsin 02; X; K 


RC_W94688_at 




perilipin 


RC_AA126743_at 




Z97200 PAC chrom 1q24: 
PMX1 homeobox gene 


RC_AA236547_at




no homology 


RC_AA255567_at 




angiopoietin-related protein-2;
angiopoietin-like 2


RC_AA421256_at






RClAA386386^s 


PPPP 
P 




RC_AA452549_at 


pppp 
P 


PRO1659; hypothetical protein
chrom 11



M63262_at 

R67290_at 
N36619_at 
L19161_at 

RC_AA496035 
L29217_s_at 
RC_W73194_a 
RC_N69507_a

RC_H15814_s
M84526_at



5-lipoxygenase activating protein (FLAP), 
13q12 

Interleukin 14

translation initiation factor 2, subunit 3,
Xp22.2-22.1

Chromosome 1? (TIGR)
CDC-like kinase 3 (CLK3), 15q24
Dermatopontin, 1q12-q23
hypothetical protein PRO1847 (Alu according
to TIGR)

adipose most abundant gene transcript 1 
D component of complement (adipsin) 



or any gene comprising a sequence as identified below 



U57316_at Human GCN5 (hGCN5) "gene," complete cds 
X66839_at H.sapiens MaTu MN mRNA for p54/58N protein 

J04599_at Human hPGI mRNA encoding bone small proteoglycan I "(biglycan)," complete cds

X57579_s_at H.sapiens activin beta-A subunit (exon 2) 

J02874_at Human adipocyte lipid-binding "protein," complete cds 

M11749_at Human Thy-1 glycoprotein "gene," complete cds

U06863_at Human follistatin-related protein precursor "mRNA," complete cds 

U51010_s_at "Human nicotinamide N-methyltransferase ""gene,"" exon 1 and 5' flanking region. /gb=U51010 /ntype=DNA /annot=exon"
U08021_at "Human nicotinamide N-methyltransferase (NNMT) ""mRNA,"" complete cds"

HG3044-HT3742_s_at """Fibronectin,"" Alt. Splice 1"
X02761_s_at

X02544_at 

M62505_at 

J05070_at 

U16306_at 

M14218_at 
L77567_s_at 

M63391_rna1_at

D13643_at 



Human mRNA for fibronectin (FN precursor)
Human mRNA for alpha1-acid glycoprotein (orosomucoid) 
Human C5a anaphylatoxin receptor "mRNA," complete cds 
Human type IV collagenase "mRNA," complete cds 

Human chondroitin sulfate proteoglycan versican V0 splice-variant precursor

peptide "mRNA," complete cds 

Human argininosuccinate lyase "mRNA," complete cds 

"Homo sapiens mitochondrial citrate transport protein (CTP) ""mRNA,"" 3' 

end" 

Human desmin gene, complete cds. 

Human mRNA for KIAA0018 "gene," complete cds 



Human adipocyte lipid-binding "protein," complete cds


J02874 


Human A1 protein "mRNA," complete cds 


U29680 


Human LGN protein "mRNA." complete cds 


U54999 


Human skeletal muscle LIM-protein SLIM2 "mRNA," partial 
cds 


U60116 


Human mRNA for alpha1-acid glycoprotein (orosomucoid) 


X02544 


Human mRNA for fibronectin receptor alpha subunit 


X06256 


H.sapiens P1-Cdc21 mRNA 


X74794 


H.sapiens mRNA for fibulin-2 


X82494 


H.sapiens 5T4 gene for 5T4 Oncofetal antigen 


Z29083 


Homo sapiens mRNA for osteoblast specific factor 2 (OSF- 
2os) 


D13666 


Mac25 


HG987-HT987 


"Human lysozyme ""mRNA,"" complete cds with an Alu 
repeat in the 3* flank" 


J03801 


Human metalloproteinase (HME) "mRNA," complete cds 


L23808 


Human alpha-1 collagen type IV gene, exon 52. 


M26576 


Human lumican "mRNA." complete cds 


U21128 


Human mRNA for fibronectin (FN precursor) 


X02761 


Human mRNA fragment for elongation factor TU (N- 
terminus). /gb=X03689 /ntype=RNA 


X03689


Human mRNA for type IV collagen alpha -2 chain 


X05610 


Human mRNA for collagen VI alpha-1 C-terminal globular 
domain 


X15880 


"H.sapiens," gene for Membrane cofactor protein


X59405 


H.sapiens SOD-2 gene for manganese superoxide dismu- 
tase. /gb=X65965 /ntype=DNA /annot=exon 


X65965 


H.sapiens NMB mRNA 


X76534 


H.sapiens vimentin gene 


Z19554 


Human chaperonin 10 "mRNA," complete cds 


U07550 


H.sapiens RING4 cDNA 


X57522 


H.sapiens genes TAP1. TAP2. LMP2, LMP7 and DOB. 


X66401 


H.sapiens mRNA for alpha 4 protein 


Y08915 


Homo sapiens interleukin-1 receptor-associated kinase 
(IRAK) "mRNA." complete cds 


L76191 


"Human von Willebrand factor ""mRNA,"" 3* end" 


M10321


Human chromosome segregation gene homolog CAS 
"mRNA," complete cds 


U33286


Human Bruton's tyrosine kinase-associated protein-135
"mRNA," complete cds. 


U77948 


"Human KH type splicing regulatory protein KSRP 
""mRNA."" complete cds." 


U94832 


H.sapiens ADE2H1 mRNA showing homologies to SAICAR 
synthetase and AIR carboxylase of the purine pathway (EC 
"6.3.2.6." EC 4.1.1.21) 


X53793 


"""Globin,"" Beta" 


HG1428- 
HT1428 


"Human alpha-1 collagen type 1 ""gene." 3' end" 


M55998 


H.sapiens mRNA for SOX-4 protein 


X70683 


"Human mRNA for collagen binding protein "2,"" complete 
cds" 


D83174 


Human SPARC/osteonectin "mRNA," complete cds 


J03040 


Human PRAD1 mRNA for cyclin 


X59798 



a Dukes C stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_D45556_at 




chrom 15; AL390085 clone


RC_W86214_at 






RC2AA0394392S- 

Jat'. ■• 




novel gene KIAA0134 protein
19q13.3 


RC_AA128935_at 






RC_AA134158_s 

_at 




class 1 homeodomain; homeo- 
box protein, chrom 7 


RC_AA232646_at




chrom 17, AF266756 sphingosine
kinase (SPHK1)


RC_AA401184_at




no homology 


RC_AA436840_at 






RC_AA488655_at 






RC_AA181902_at 


PPPP 
P 


AC007201 on chrom 19 (only
80 nt hom)



RC_AA122350
AA374109_at 

RC_AA621755 
RC_AA442069 
RC_T40767_a 
RC_AA488655 
RC_AA398908 
RC_AA447764 

RC_N69136_a 



chromosome 8 

spondin 2, extracellular matrix 

protein, chromosome 4 

transcription factor Dp-2, 3q23 

sodium channel 2, 12q12 

chromosome 19 

Mus? 

hypothetical protein, chromosome 
4 



or any gene comprising a sequence as identified below 



M20681_at Human glucose transporter-like protein-III "(GLUT3)," complete cds
D50914_at
L37362_at

X66114_ma1 

M32053_at 
Y00787_s_at 

U64444_at



Human mRNA for KIAA0124 "gene," partial cds 

Homo sapiens (clone d2-115) kappa opioid receptor (OPRK1)

"mRNA," complete cds

H.sapiens gene for 2-oxoglutarate carrier protein.

Human H19 RNA "gene," complete cds (spliced in silico)

Human mRNA for MDNCF (monocyte-derived neutrophil chemotactic 

factor) 

Human ubiquitin fusion-degradation protein (UFD1L) "mRNA," complete cds

H.sapiens mRNA for DNA binding protein A variant 
H.sapiens uPA gene 



X95325_s_at 
X02419_rna1_s_at

X57522_at H.sapiens RING4 cDNA 

AB001325_at Human AQP3 gene for aquaporine 3 (water "channel)," partial cds
AB002315_at Human mRNA for KIAA0317 "gene," complete cds. /gb=AB002315 
/ntype=RNA 

L12760_s_at "Human phosphoenolpyruvate carboxykinase (PCK1) ""gene,"" complete cds with repeats"



Ribosomal Protein L39 Homolog


HG2874- 
HT3018 


Homo sapiens (clone d2-115) kappa opioid receptor 
(OPRK1) "mRNA." complete cds 


L37362 


Human kell blood group protein mRNA 


M64934 




U73167 


Human cancellous bone osteoblast mRNA for serin pro- 
tease with IGF-binding "motif." complete cds 


D87258 


Human interferon-inducible protein 9-27 "mRNA," complete cds


J04164 


"Human sickle cell beta-globin ""mRNA,"" complete cds" 


M25079 




M29277 


"Human spermidine synthase ""mRNA,"" complete cds" 


M34338 


Human copine 1 "mRNA," complete cds 


U83246


"""Globin,"" Beta"


HG1428- 
HT1428 


"Human alpha-1 collagen type 1 ""gene."" 3' end" 


M55998 


H.sapiens mRNA for SOX-4 protein 


X70683 


"Human mRNA for collagen binding proteiri ""2."" complete 

cds" 


D83174 


Human SPARC/osteonectin "mRNA," complete cds 


J03040 


Human PRAD1 mRNA for cyclin 


X59798 



a Dukes D stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_N91920_at 



AAAA 
P 



chrom 16p12-p11.2 ;
XM_007994 retinoblastoma
binding protein


RC_AA621601_at



AAAA 
P 



chrom 17 XM_009868 RAB36
RAS oncogene family



RC_AA121433 
RC_N91920_a 

RC_AA621601 

RC_AA454020

RC_Z39652_a 



Axin, chromosome 16 

RB protein binding protein, 

chromosome 16 

GTP-binding protein Rab36,

chromosome 17

NADPH quinone oxidoreductase
homolog; p53 induced,

chromosome 2 

API\A-1 gene, chromosome 18 



or any gene comprising a sequence as identified below 



X17644_s_at Human GST1-Hs mRNA for GTP-binding protein

Y12812_at H.sapiens RFXAP mRNA
X60486_at H.sapiens H4/g gene for H4 histone
X52221_at H.sapiens ERCC2 "gene," exons 1 & 2 (partial)
L06175_at Homo sapiens P5-1 "mRNA," complete cds
Z48481_at H.sapiens mRNA for membrane-type matrix metalloproteinase 1

X54232_at Human mRNA for heparan sulfate proteoglycan (glypican)
L08010_at "Homo sapiens reg gene ""homologue,"" complete cds"
L27706_at Human chaperonin protein (Tcp20) gene complete cds
L15533_rna1_at Homo sapiens pancreatitis-associated protein (PAP) gene, complete cds.

X51408_at Human mRNA for n-chimaerin

K02765_at Human complement component C3 "mRNA," alpha and beta "subunits," complete cds



Homo sapiens FRG1 "mRNA," complete cds 


L76159 


Human cyclin protein "gene," complete cds 


M15796


Human U2 small nuclear RNA-associated B" antigen 
"mRNA." complete cds 


M15841 


Human mRNA export protein Rae1 (RAE1) "mRNA," com- 
plete cds. 


U84720 


Human protease-activated receptor 3 (PAR3) "mRNA." 
complete cds. 


U92971 


H.sapiens mRNA for mediator of receptor-induced toxicity 


X84709 


H.sapiens RFXAP mRNA 


Y12812 


Human mRNA for "Qip1," complete cds


AB002533 


Human mRNA for transferrin receptor 


X01060


metastasis-associated gene [human, highly metastatic
lung cell subline Anip[937], mRNA Partial, 978 nt]


S79219 



The genes selected may be a gene from each gene group being expressed in a
significantly lower amount in that stage than in one of the other stages, such as:

a Dukes A stage gene is selected individually from any gene comprising a sequence
as identified below



RC_N32411_f_at 


PAPP 
p 


Myc-associated zinc-finger
protein of human islet; chrom
16


RC_AA243858_at 


PAPP 
P 


KIAA0882 protein 


RC_AA486283_at 


PAPP 
p 


ras-like protein; ras-related C3
botulinum toxin substrate;
dJ20J23




PAPP 

P 


V»l If villi • Wj f\9AyA\ 9 *t%JV fJt XJIGH i 


RC 1-154088 s at 


PPPP 
P 


ribosomal protein L41


RC_H59052_f_at 


PPPP 
P 


fungal sterol-C5-desaturase 
homolog: ORF; thymosin beta- 
4 


RC_R49198_s_at 


PPPP 

P 




RC_T73572_f_at 


PPPP 
P 


ferritin L-chain; L apoferritin


RC_AA477483_at 


PPPP 
P 


no matching est 



or any gene comprising a sequence as identified below



Homo sapiens SKB1Hs "mRNA," complete cds.
/gb=AF015913 /ntype=RNA
Mucin (Gb:M22406)

Human platelet activating factor "acetylhydrolase," brain
"isoform," 45 kDa subunit (LIS1) gene
Homo sapiens ERK activator kinase (MEK2) mRNA
Human 20-kDa myosin light chain (MLC-2) "mRNA,"
complete cds

H.sapiens lysosomal acid phosphatase gene (EC 3.1.3.2)

Exon 1 (and joined CDS). 

Human mRNA for matrix Gla protein 

H.sapiens mRNA for diacylglycerol kinase 

Human heat shock protein (hsp 70) gene, complete cds. 

Human TRPM-2 protein gene



AF015913 

HG1067- 

HT1067 

U72342 

L11285
J02854

X15525

X53331 
X62535 
M11717 
M63379 



a Dukes B stage gene is selected individually from any gene comprising a sequence
as identified below


RC_D59847_at


PPAP 
P 


proSAAS; granin-like neuroen- 
docrine peptide precursor 


RC_F05038_at 


PPAP 
P 


polyamine modulated factor-1; 
polyamine modulated factor 1 


RC_N41Q59_at 


PPAP 
P 


chrom 3 


RC_T2346q^at , 


PPAP 
P 


chrom 3; IFNAR2 21q22.11


RC_W42789_at


PPAP 
P 


chrom 8 AF266037 C8ORF4
protein (C8ORF4) chrom 8
ORF


RC AA460017J^ 

at • 1-? 


PPAP 
P 


BAC Clarke chrom 16 


RC_AA482127_at 


PPAP 
P 


KIAA1142 protein


RC_AA504806_at 


PPAP 

P 


chrom 2 AF052107 clone 
23620 mRNA sequence 


RC_T90037_at 


PPPP 
P 


unnamed protein product, 
chrom 4 


RC_AA432130_at 


PPPP 
P 


KIAA0887 protein, chrom 12 



or any gene comprising a sequence as identified below 



Human gene for mitochondrial acetoacetyl-CoA thiolase 
Human mRNA for transcription factor "AREB6," complete 
cds 

Human mRNA for KIAA0248 "gene," partial cds

Homo sapiens (clone CC6) NADH-ubiquinone oxidoreductase
subunit "mRNA," 3' end cds

Human phosphoglucomutase 1 (PGM1) "mRNA," complete cds

Homo sapiens guanylin "mRNA," complete cds
"Human trans-Golgi p230 ""mRNA,"" complete cds"
H.sapiens mRNA for vacuolar proton "ATPase," subunit D
H.sapiens mRNA for 3-hydroxy-3-methylglutaryl coenzyme A synthase

Human mRNA for KIAA0018 "gene," complete cds

"Mucin ""1,"" ""Epithelial,"" Alt. Splice 9"



H.sapiens mRNA for L-3-hydroxyacyl-CoA dehydrogenase



D10511 
D15050 

D87435 
L04490 

M83088 

M97496 
U41740 
X71490 
X83618 

D13643 
HG371-HT26388
X96752 



a Dukes C stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_N30231_at 



PPPA 
P 



Lsm4 protein; U6 snRNA-
associated Sm-like protein
LSm4; glycine-rich protein


RC_W73790_f_at


PPPA 
P 


immunoglobulin-related pro-
tein 14.1; lambda L-chain C 
region; omega protein, chrom 
22 


RC AA412184 at 

j>. & -■ 


PPPA 
P 


chrom 1p36; D89060 dolichyl-diphosphooligosaccharide-
protein glycosyltransferase


RC_AA521303_at 


PPPA 
P 


methionine adenosyltransfera- 
se regulatory beta subunit; 
dTDP-4-keto-6-deoxy-D- 
glucose 4-reductase, chrom 5 


RC_AA461174_at 


PPPP 
P 


8p21.3-p22 AB020860 anti-oncogene


AA393432_s_at 


PPPP 
P 


chrom 2, Unknown; unnamed 
protein product AAD20029 



or any gene comprising a sequence as identified below 



Homo sapiens colon mucosa-associated (DRA) L02785 
"mRNA," complete cds 

Human Ig J chain gene M12759

Human selenium-binding protein (hSBP) "mRNA," U29091
complete cds. /gb=U29091 /ntype=RNA

H.sapiens mRNA for sigma 3B protein X99459
Human ERK1 mRNA for protein serine/threonine kinase X60188

Human mRNA for mitochondrial 3-oxoacyl-CoA "thiolase," D16294
complete cds

"Biliary ""Glycoprotein,"" Alt. Splice ""5."" A" HG2850- 

HT4814 

Human AQP3 gene for aquaporine 3 (water "channel)," AB001325
partial cds

Human CD14 mRNA for myelid cell-specific leucine-rich X13334
glycoprotein

Human thioredoxin "mRNA," nuclear gene encoding mito- U78678 
chondrial "protein," complete cds 

Human mitochondrial ATPase coupling factor 6 subunit M37104 
(ATP5A) "mRNA," complete cds 

"Human MHC class II HLA-DP light chain ""mRNA."" com- M57466 
plete cds" 

Human mRNA for early growth response protein 1 X52541
(hEGR1)

Human mRNA for mitochondrial 3-ketoacyl-CoA thiolase D16481 
beta-subunit of trifunctional "protein," complete cds 
Homo sapiens laminin-related protein (LamA3) "mRNA," L34155
complete cds

H.sapiens mRNA for selenoprotein P Z11793

Human hkf-1 "mRNA." complete cds D76444 
Homo sapiens nuclear domain 10 protein (ndp52) "mRNA," U22897 
complete cds 
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Human XI 04 "mRNA." complete cds L27476 

H. sapiens cDNA for RFG X77548 

H.sapiens mRNA for Progression Associated Protein Y07909 

Human liver "2,4-dienoyl-CoA" reductase "mRNA," com- U49352 
plete cds 

Human A33 antigen precursor "mRNA," complete cds U79725 

H.sapiens pS2 protein gene X52003 

Human RASF-A Pl^2 "mRNA," complete cds M22430 

Homo sapiens psti mRNA for pancreatic secretory inhibitor Y00705 
(expressed in neoplastic tissue). 

Human CO-029 M35252 



a Dukes D stage gene is selected individually from any gene comprising a sequence 
as identified below 




RC_R72886_s_at    PPPP  A  KIAA0422; adenylyl cyclase type VI, chrom 12
RC_AA026030_at    PPPP  A  chrom 1
RC_Z39006_at      PPPP  A  hypothetical protein, chrom 17
RC_AA435908_at    PPPP  A  chrom 19; AC011491 clone and 20 nt hom. RAB2, RAS oncogene family
RC_AA057829_s_at  PPPP  A  growth-arrest-specific protein; growth arrest-specific 6; AXL stimulatory factor, chrom 13
RC_R72087_at      PPPP  A  chrom 5 EST; hom. to chrom 20 AL356652 clone
RC_H04242_at      PPPP  A  ras related protein Rab5b; RAB5B, member RAS oncogene family
RC_R97304_f_at    PPPP  A  HLA-drb5; cell surface glycoprotein; MHC HLA-DR-beta chain precursor chrom 6
Rp^N486p9^at      PPPP  A  chrom 11; AC004584 chrom 17
RC_W86850_f_at    PPPP  A  chrom 22 ? X96924 mitochondrial citrate transport region
RC_AA130603_at    PPPP  A  ak024908 clone
RC_AA479610_at    PPPP  A  singleton ak025344 clone
RC_AA490593_i_at  PPPP  A  chrom 17 ? Synaptobrevin2 (VAMP2) AF135372
RC_AA054321_s_at  PPPP  A  6p21 HLA class I region; AC004202 clone
RC_D60328_at      PPPP  P  chrom 6, unknown; ring finger protein 5
RC_H96850_at      PPPP  P  oligosaccharyltransferase D89060 1p36.1 (also C-class)
RC_AA127444_at    PPPP  P  chrom 1 no homology
RC_AA242824_at    PPPP  P  chrom 11; ac005233 PAC clone chrom 22
AA405775_s_at     PPPP  P  similar to CAA16821 (PID:g3255952)



or any gene comprising a sequence as identified below 

Human complement component C3 mRNA, alpha and beta subunits, complete cds    K02765
H.sapiens mRNA for adenosine triphosphatase, calcium    Z69881
Human skeletal muscle LIM-protein SLIM1 mRNA, complete cds    U60115
Human platelet-derived growth factor receptor alpha (PDGFRA) mRNA, complete cds    M21574
Human mRNA for KIAA0247 gene, complete cds    D87434
Human mRNA for KIAA0171 gene, complete cds    D79993
Human Down syndrome critical region protein (DSCR1) mRNA, complete cds    U28833
Human Ki nuclear autoantigen mRNA, complete cds    U11292



Expression patterns

The objects of the invention are achieved by providing one or more of the 
embodiments described below. In one embodiment a method is provided of 
determining an expression pattern of a cell sample preferably independent of the 
proportion of submucosal, muscle and connective tissue cells present. Expression is
determined of one or more genes in a sample comprising cells, said genes being 
selected from the same genes as discussed above and shown in the tables of the 
Examples. 

It is an object of the present invention that characteristic patterns of expression of
genes can be used to characterize different types of tissue. Thus, for example, gene
expression patterns can be used to characterize stages and grades of colorectal tumors.
Similarly, gene expression patterns can be used to distinguish cells having a
colorectal origin from other cells. Moreover, gene expression of cells which routinely
contaminate colorectal tumor biopsies has been identified, and such gene
expression can be removed or subtracted from patterns obtained from colorectal
biopsies. Further, the gene expression patterns of single-cell solutions of colorectal
tumor cells have been found to be far freer of interfering expression of
contaminating muscle, submucosal, and connective tissue cells than biopsy
samples.

The one or more genes exclude genes which are expressed in the submucosal, 
muscle, and connective tissue. A pattern of expression is formed for the sample 
which is independent of the proportion of submucosal, muscle, and connective 
tissue cells in the sample.

In another aspect of the invention a method of determining an expression pattern of 
a cell sample is provided. Expression is determined of one or more genes in a 
sample comprising cells. A first pattern of expression is thereby formed for the 
sample. Genes which are expressed in submucosal, muscle, and connective tissue
cells are removed from the first pattern of expression, forming a second pattern of 
expression which is independent of the proportion of submucosal, muscle, and 
connective tissue cells in the sample. 

Another embodiment of the invention provides a method for determining an
expression pattern of a colorectal mucosa or colorectal cancer cell. Expression is
determined of one or more genes in a sample comprising colorectal mucosa or
colorectal cancer cells; the expression determined forms a first pattern of
expression. A second pattern of expression, which was formed using the one or
more genes and a sample comprising predominantly submucosal, muscle, and
connective tissue cells, is subtracted from the first pattern of expression, forming a
third pattern of expression. The third pattern of expression reflects expression of the
colorectal mucosa or colorectal cancer cells independent of the proportion of
submucosal, muscle, and connective tissue cells present in the sample.
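
The subtraction of a connective-tissue pattern from a sample pattern can be illustrated with a minimal sketch. The gene names, the intensity values, and the choice to clamp negative differences to zero are assumptions made for illustration only; the text does not specify how the subtraction is scaled or how negative results are handled.

```python
def subtract_pattern(first_pattern, wall_pattern):
    """Subtract a reference (colon wall) pattern from a sample pattern.

    Both patterns map a gene identifier to an expression intensity.
    Genes absent from the wall pattern are treated as not expressed there.
    Negative differences are clamped to zero (an assumption of this sketch).
    """
    return {
        gene: max(level - wall_pattern.get(gene, 0.0), 0.0)
        for gene, level in first_pattern.items()
    }


# Hypothetical intensities, invented for illustration.
first_pattern = {"gene_a": 850.0, "gene_b": 400.0, "gene_c": 300.0}
wall_pattern = {"gene_b": 350.0, "gene_c": 500.0}

third_pattern = subtract_pattern(first_pattern, wall_pattern)
print(third_pattern)
```

In practice the two patterns would need to be normalized to a common scale before subtraction; that step is omitted here.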


Diagnosing 

In another embodiment of the invention a method is provided of detecting an
invasive tumor in a patient. A marker is detected in a sample of a body fluid. The
body fluid is selected from the group consisting of blood, plasma, serum, faeces,
mucus, sputum, cerebrospinal fluid and/or urine. The marker is an mRNA or protein
expression product of a gene which is more prevalent in submucosal, muscle, and
connective tissue than in the body fluid. An increased amount of the marker in the
body fluid indicates a tumor which has become invasive in the patient.


In another aspect of the invention a method is provided for diagnosing a colorectal
cancer. A first pattern of expression is determined of one or more genes in a colonic
tissue sample suspected of being neoplastic. The first pattern of expression is
compared to a second and a third reference pattern of expression. The second pattern
is of the one or more genes in normal colorectal mucosa and the third pattern is of
the one or more genes in colorectal cancer. A first pattern of expression which is
found to be more similar to the third pattern than to the second indicates neoplasia of
the colorectal tissue sample.

According to yet another aspect of the invention a method is provided for predicting
outcome or prescribing treatment of a colorectal tumor. A first pattern of expression
is determined of one or more genes in a colorectal tumor sample. The first pattern is
compared to one or more reference patterns of expression determined for colorectal
tumors at a grade between I and IV. The reference pattern which shares maximum
similarity with the first pattern is identified. The outcome or treatment appropriate for
the grade of tumor of the reference pattern with the maximum similarity is assigned
to the colorectal tumor sample.

In another embodiment of the invention a method is provided for determining the grade
of a colorectal tumor. A first pattern of expression is determined of one or more
genes in a colorectal tumor sample. The first pattern is compared to one or more
reference patterns of expression determined for colorectal tumors at a grade
between I and IV. The grade of the reference pattern with the maximum similarity is
assigned to the colorectal tumor sample.


Yet another embodiment of the invention provides a method to determine the stage of a
colorectal tumor as described above. A first pattern of expression is determined of
one or more genes in a colorectal tumor sample. The first pattern is compared to
one or more reference patterns of expression determined for colorectal tumors at
different stages. The reference pattern which shares maximum similarity with the
first pattern is identified. The stage of the reference pattern with the maximum
similarity is assigned to the colorectal tumor sample.
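
The assignment of a grade or stage by maximum similarity can be sketched as follows. The text does not name a similarity measure; Pearson correlation is used here purely as an assumed example, and the reference patterns, gene names, and intensities are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of intensities.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def assign_label(sample, references):
    """Return the label of the reference pattern most similar to the sample."""
    genes = sorted(sample)  # fixed gene order shared by all patterns
    sample_levels = [sample[g] for g in genes]
    scores = {
        label: pearson(sample_levels, [ref[g] for g in genes])
        for label, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference patterns for two stages.
references = {
    "Dukes B": {"gene_a": 100.0, "gene_b": 50.0, "gene_c": 10.0},
    "Dukes C": {"gene_a": 20.0, "gene_b": 60.0, "gene_c": 90.0},
}
sample = {"gene_a": 30.0, "gene_b": 55.0, "gene_c": 80.0}

label, scores = assign_label(sample, references)
print(label)
```

The same function applies to grades I to IV by supplying one reference pattern per grade.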

In still another embodiment of the invention a method is provided for identifying a
tissue sample as colorectal. A first pattern of expression is determined of one or
more genes in a tissue sample. The first pattern is compared to a second pattern of
expression obtained for normal mucosa cells. Similarity between the first
and the second patterns suggests that the tissue sample is mucosa in its origin. This
method is particularly useful when diagnosing a metastasis possibly distant from
its origin.

Another aspect of the invention is a method to aid in diagnosing, predicting
outcome, or prescribing treatment of a colorectal cancer. A first pattern of
expression is determined of one or more genes in a first colorectal tissue sample. A
second pattern of expression is determined of the one or more genes in a second
colorectal tissue sample. The first colorectal tissue sample is a normal colorectal
mucosa sample or an earlier stage or lower grade of colorectal tumor than the
second colorectal tissue sample. The first pattern of expression is compared to the
second pattern of expression to identify a first set of genes which are increased in
the second colorectal tissue sample relative to the first colorectal tissue sample and
a second set of genes which are decreased in the second colorectal tissue sample
relative to the first colorectal tissue sample. Those genes which are expressed in
submucosal, muscle or connective tissue are removed from the first set of genes.
Those genes which are not expressed in submucosal, muscle, or connective tissue
are removed from the second set of genes.
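
The construction of the two gene sets and their filtering against colon wall expression can be sketched as below. The two-fold threshold, the gene names, and the intensities are assumptions for illustration; the text does not state how "increased" and "decreased" are quantified.

```python
def changed_gene_sets(first, second, wall_expressed, fold=2.0):
    """Identify increased and decreased genes, then filter by wall expression.

    first, second  : dicts mapping gene -> intensity for the two samples
    wall_expressed : set of genes expressed in submucosal, muscle or
                     connective tissue
    fold           : assumed fold-change threshold (not taken from the text)
    """
    increased = {g for g in first if second[g] > fold * first[g]}
    decreased = {g for g in first if first[g] > fold * second[g]}
    # Genes expressed in the wall are removed from the first (increased) set.
    increased -= wall_expressed
    # Genes NOT expressed in the wall are removed from the second (decreased) set.
    decreased &= wall_expressed
    return increased, decreased


# Hypothetical intensities for an earlier-stage and a later-stage sample.
first = {"a": 10.0, "b": 100.0, "c": 50.0, "d": 80.0, "e": 40.0}
second = {"a": 40.0, "b": 20.0, "c": 55.0, "d": 300.0, "e": 10.0}
wall_expressed = {"d", "e"}

increased, decreased = changed_gene_sets(first, second, wall_expressed)
print(sorted(increased), sorted(decreased))
```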

Independence of submucosal, muscle and connective tissue 

Since a biopsy of the tissue often contains more tissue material than the tissue to
be examined, such as connective tissue when the tissue to be examined is
epithelial or mucosa, the invention also relates to methods wherein the expression
pattern of the tissue is independent of the amount of connective tissue in the
sample.

Biopsies contain epithelial cells that most often are the targets for the studies, and in
addition many other cells that contaminate the epithelial cell fraction to a varying
extent. The contaminants include histiocytes, endothelial cells, leukocytes, nerve cells,
muscle cells etc. Microdissection is the method of choice for DNA examination, but in
the case of expression studies this procedure is difficult due to RNA degradation during
the procedure. The epithelium may be gently removed and the expression in the remaining
submucosa and underlying connective tissue (the colon wall) monitored. Genes
expressed at high or low levels in the colon wall should be interrogated when
performing expression monitoring of the mucosa and tumors. A similar approach could
be used for studies of epithelia in other organs.

Normal mucosa lining the colon lumen from colons resected for colon cancer was scraped off.
Then biopsies were taken from the denuded submucosa and connective tissue,
reaching approximately 5 mm into the colon wall, and immediately disintegrated in
guanidinium isothiocyanate. Total RNA may be extracted, pooled, and poly(A)+ mRNA
may be prepared from the pool followed by conversion to double-stranded cDNA and
in vitro transcription into cRNA containing biotin-labeled CTP and UTP.

Genes that are expressed and genes that are not expressed in colon wall can both 
interfere with the interpretation of the expression in a biopsy, and should be
interrogated when interpreting expression intensities in tumor biopsies, as the colon 
wall component of a biopsy varies in amount from biopsy to biopsy. 

Once the pattern of genes expressed in colon wall components has been determined,
said pattern may be subtracted from a pattern obtained from the sample, resulting in a
third pattern related to the mucosa (epithelial) cells.

In another aspect of the invention a method is provided for determining an
expression pattern of a colorectal tissue sample independent of the proportion of
submucosal, muscle and connective tissue cells present. A single-cell suspension of
disaggregated colorectal tumor cells is isolated from a colorectal tissue sample
comprising colorectal cells, submucosal cells, muscle cells, and connective tissue
cells. A pattern of expression is thus formed for the sample which is independent of
the proportion of submucosal, muscle, and connective tissue cells in the colorectal
tissue sample.

Yet another method relates to eliminating mRNA from colon wall components before
determining the pattern, e.g. by filtration and/or affinity chromatography to remove
mRNA related to the colon wall.

Detection 

Working with human tumor material requires biopsies, and working with RNA
requires freshly frozen or immediately processed biopsies. Apart from the cancer
tissue, biopsies inevitably contain many different cell types, such as cells present
in the blood, connective and muscle tissue, endothelium etc. In the case of DNA
studies, microdissection or laser capture are the methods of choice; however, the
time-dependent degradation of RNA makes it difficult to perform manipulation of the
tissue for more than a few minutes. Furthermore, studies of expressed sequences
may be difficult on the few cells obtained via microdissection or laser capture, as
these may have an expression pattern that deviates from the predominant pattern in
a tumor due to large intratumoral heterogeneity.


In the present context high density expression arrays may be used to evaluate the
impact of colorectal wall components in colorectal tumor biopsies, and to test
preparation of single-cell solutions as a means of eliminating the contaminants. The
results of these evaluations permit us to design methods of evaluating colorectal
samples without the interfering background noise caused by ubiquitous
contaminating submucosal, muscle, and connective tissue cells. The evaluating
assays of the invention may be of any type.

While high density expression arrays can be used, other techniques are also
contemplated. These include other techniques for assaying for specific mRNA
species, including RT-PCR and Northern blotting, as well as techniques for
assaying for particular protein products, such as ELISA, Western blotting, and
enzyme assays. Gene expression patterns according to the present invention are
determined by measuring any gene product of a particular gene, including mRNA
and protein. A pattern may be for one or more genes.

RNA or protein can be isolated and assayed from a test sample using any
techniques known in the art. They can for example be isolated from fresh or frozen
biopsy, from formalin-fixed tissue, or from body fluids, such as blood, plasma, serum,
urine, or sputum.

The expression data provided for submucosal, muscle, and connective tissue can
be used in at least three ways to improve the quality of data for a tested sample.
The genes identified in the data as expressed can be excluded from the testing or
from the analysis. Alternatively, the intensity of expression of the genes expressed
in the submucosal, muscle, and connective tissue can be subtracted from the
intensity of expression determined for the test tissue.

The data collected and disclosed here as "connective tissue" is presumed to contain
both muscle and submucosal gene expression as well. Thus it represents the
composite expression of these cell types which can typically contaminate a
colorectal biopsy.

Detection of expression 


Expression of genes may in general be detected by either detecting mRNA from the
cells and/or detecting expression products, such as peptides and proteins.

mRNA detection 


The detection of mRNA according to the invention may be a tool for determining the
developmental stage of a cell type, which may be definable by its pattern of expression
of messenger RNA. For example, in particular stages of cells, high levels of ribosomal
RNA are found whereas relatively low levels of other types of messenger RNAs may
be found. Where a pattern is shown to be characteristic of a stage, a stage may be
defined by that particular pattern of messenger RNA expression. The mRNA
population is a good determinant of developmental stage, and will be correlated with
other structural features of the cell. In this manner, cells at specific developmental
stages will be characterized by the intracellular environment, as well as the
extracellular environment. The present invention also allows the combination of
definitions based, in part, upon antigens and, in part, upon mRNA expression.
In one embodiment, the two may be combined in a single incubation step. A
particular incubation condition may be found which is compatible with both
hybridization recognition and non-hybridization recognition molecules. Thus, e.g., an
incubation condition may be selected which allows both specificity of antibody
binding and specificity of nucleic acid hybridization. This allows simultaneous
performance of both types of interactions on a single matrix. Again, where
developmental mRNA patterns are correlated with structural features, or with probes
which are able to hybridize to intracellular mRNA populations, a cell sorter may be
used to sort specifically those cells having desired mRNA population patterns.

It is within the general scope of the present invention to provide methods for the
detection of mRNA. Such methods often involve sample extraction, PCR
amplification, nucleic acid fragmentation and labeling, extension reactions,
transcription reactions and the like.

Sample preparation 

The nucleic acid (either genomic DNA or mRNA) may be isolated from the sample
according to any of a number of methods well known to those of skill in the art. One
of skill will appreciate that where alterations in the copy number of a gene are to be
detected, genomic DNA is preferably isolated. Conversely, where expression levels
of a gene or genes are to be detected, preferably RNA (mRNA) is isolated.

Methods of isolating total mRNA are well known to those of skill in the art. In one
embodiment, the total nucleic acid is isolated from a given sample using, for
example, an acid guanidinium-phenol-chloroform extraction method, and poly(A)+
mRNA is isolated by oligo dT column chromatography or by using (dT)n magnetic
beads (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd
ed.), Vols. 1-3, Cold Spring Harbor Laboratory (1989), or Current Protocols in
Molecular Biology, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience,
New York (1987)).

The sample may be from tissue and/or body fluids, as defined elsewhere herein.
Before analyzing the sample, e.g., on an oligonucleotide array, it will often be
desirable to perform one or more sample preparation operations upon the sample.
Typically, these sample preparation operations will include such manipulations as
extraction of intracellular material, e.g., nucleic acids from whole cell samples,
viruses and the like, amplification of nucleic acids, fragmentation, transcription,
labeling and/or extension reactions. One or more of these various operations may
be readily incorporated into the device of the present invention.

DNA Extraction 

DNA extraction may be relevant in case possible mutations in the genes are to be
determined in addition to the determination of expression of the genes.

For those embodiments where whole cells, or other tissue samples are being
analyzed, it will typically be necessary to extract the nucleic acids from the cells or
viruses, prior to continuing with the various sample preparation operations.
Accordingly, following sample collection, nucleic acids may be liberated from the
collected cells, viral coat, etc., into a crude extract, followed by additional treatments
to prepare the sample for subsequent operations, e.g., denaturation of
contaminating (DNA binding) proteins, purification, filtration, desalting, and the like.


Liberation of nucleic acids from the sample cells, and denaturation of DNA binding
proteins may generally be performed by physical or chemical methods. For
example, chemical methods generally employ lysing agents to disrupt the cells and
extract the nucleic acids from the cells, followed by treatment of the extract with
chaotropic salts such as guanidinium isothiocyanate or urea to denature any
contaminating and potentially interfering proteins.

Alternatively, physical methods may be used to extract the nucleic acids and
denature DNA binding proteins, such as physical protrusions within microchannels
or sharp-edged particles that pierce cell membranes and extract their contents.
Combinations of such structures with piezoelectric elements for agitation can
provide suitable shear forces for lysis.

More traditional methods of cell extraction may also be used, e.g., employing a
channel with restricted cross-sectional dimension which causes cell lysis when the
sample is passed through the channel with sufficient flow pressure. Alternatively,
cell extraction and denaturing of contaminating proteins may be carried out by
applying an alternating electrical current to the sample. More specifically, the sample
of cells is flowed through a microtubular array while an alternating electric current is
applied across the fluid flow. Subjecting cells to ultrasonic agitation, or forcing cells
through microgeometry apertures, thereby subjecting the cells to high shear stress
resulting in rupture, are also possible extraction methods.

Filtration 


Following extraction, it will often be desirable to separate the nucleic acids from
other elements of the crude extract, e.g., denatured proteins, cell membrane
particles, salts, and the like. Removal of particulate matter is generally accomplished
by filtration, flocculation or the like. Further, where chemical denaturing methods are
used, it may be desirable to desalt the sample prior to proceeding to the next step.
Desalting of the sample, and isolation of the nucleic acid may generally be carried
out in a single step, e.g., by binding the nucleic acids to a solid phase and washing
away the contaminating salts or performing gel filtration chromatography on the
sample, passing salts through dialysis membranes, and the like. Suitable solid
supports for nucleic acid binding include, e.g., diatomaceous earth, silica (i.e., glass
wool), or the like. Suitable gel exclusion media, also well known in the art, may also
be readily incorporated into the devices of the present invention, and are
commercially available from, e.g., Pharmacia and Sigma Chemical.

Alternatively, desalting methods may generally take advantage of the high
electrophoretic mobility and negative charge of DNA compared to other elements.
Electrophoretic methods may also be utilized in the purification of nucleic acids from
other cell contaminants and debris. Upon application of an appropriate electric field,
the nucleic acids present in the sample will migrate toward the positive electrode
and become trapped on the capture membrane. Sample impurities remaining free of
the membrane are then washed away by applying an appropriate fluid flow. Upon
reversal of the voltage, the nucleic acids are released from the membrane in a
substantially purer form. Further, coarse filters may also be overlaid on the barriers
to avoid any fouling of the barriers by particulate matter, proteins or nucleic acids,
thereby permitting repeated use.

Separation of contaminants by chromatography 

In a similar aspect, the high electrophoretic mobility of nucleic acids with their
negative charges may be utilized to separate nucleic acids from contaminants by
utilizing a short column of a gel or other appropriate matrix which will slow or
retard the flow of other contaminants while allowing the faster nucleic acids to pass.

This invention provides nucleic acid affinity matrices that bear a large number of
different nucleic acid affinity ligands allowing the simultaneous selection and
removal of a large number of preselected nucleic acids from the sample. Methods of
producing such affinity matrices are also provided. In general the methods involve
the steps of a) providing a nucleic acid amplification template array comprising a
surface to which are attached at least 50 oligonucleotides having different nucleic
acid sequences, and wherein each different oligonucleotide is localized in a
predetermined region of said surface, the density of said oligonucleotides is greater
than about 60 different oligonucleotides per 1 cm², and all of said different
oligonucleotides have an identical terminal 3' nucleic acid sequence and an identical
terminal 5' nucleic acid sequence; b) amplifying said multiplicity of oligonucleotides
to provide a pool of amplified nucleic acids; and c) attaching the pool of nucleic
acids to a solid support.

For example, nucleic acid affinity chromatography is based on the tendency of
complementary, single-stranded nucleic acids to form a double-stranded or duplex
structure through complementary base pairing. A nucleic acid (either DNA or RNA)
can easily be attached to a solid substrate (matrix) where it acts as an immobilized
ligand that interacts with and forms duplexes with complementary nucleic acids
present in a solution contacted to the immobilized ligand. Unbound components can
be washed away from the bound complex to either provide a solution lacking the
target molecules bound to the affinity column, or to provide the isolated target
molecules themselves. The nucleic acids captured in a hybrid duplex can be
separated and released from the affinity matrix by denaturation either through heat,
adjustment of salt concentration, or the use of a destabilizing agent such as
formamide, TWEEN(TM)-20 denaturing agent, or sodium dodecyl sulfate (SDS).

Affinity columns (matrices) are typically used to isolate a single nucleic acid,
typically by providing a single species of affinity ligand. Alternatively, affinity columns
bearing a single affinity ligand (e.g., oligo dT columns) have been used to isolate a
multiplicity of nucleic acids where the nucleic acids all share a common sequence
(e.g., a polyA).
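
The principle of capture by complementary base pairing can be illustrated with a short sketch. It models only perfect complementarity over the full length of the ligand, using a DNA alphabet; real hybridization also depends on salt concentration, temperature, and tolerated mismatches, none of which are modeled here. The sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def captures(ligand, target):
    """True if the immobilized ligand can form a perfect duplex with some
    stretch of the target (a simplifying assumption of this sketch)."""
    return reverse_complement(ligand) in target


oligo_dt = "TTTTTTTT"                    # single affinity ligand
with_tail = "GGCATCAAAAAAAAAA"           # hypothetical transcript with a polyA tail
without_tail = "GGCATCGGCATC"            # hypothetical sequence lacking a polyA tail

print(captures(oligo_dt, with_tail), captures(oligo_dt, without_tail))
```

A sequence-specific ligand such as "GATGCC" would, under the same model, capture both example targets, since its reverse complement "GGCATC" occurs in each.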

Affinity matrices 

The type of affinity matrix used depends on the purpose of the analysis. For
example, where it is desired to analyze mRNA expression levels of particular genes
in a complex nucleic acid sample (e.g., total mRNA) it is often desirable to eliminate
nucleic acids produced by genes that are constitutively overexpressed and thereby
tend to mask gene products expressed at characteristically lower levels. Thus, in
one embodiment, the affinity matrix can be used to remove a number of preselected
gene products (e.g., actin, GAPDH, etc.). This is accomplished by providing an
affinity matrix bearing nucleic acid affinity ligands complementary to the gene
products (e.g., mRNAs or nucleic acids derived therefrom) or to subsequences
thereof. Hybridization of the nucleic acid sample to the affinity matrix will result in
duplex formation between the affinity ligands and their target nucleic acids. Upon
elution of the sample from the affinity matrix, the matrix will retain the duplexed
nucleic acids, leaving a sample depleted of the overexpressed target nucleic acids.

The affinity matrix can also be used to identify unknown mRNAs or cDNAs in a
sample. Where the affinity matrix contains nucleic acids complementary to every
known gene (e.g., in a cDNA library, DNA reverse transcribed from an mRNA,
mRNA used directly or amplified, or polymerized from a DNA template) in a sample,
capture of the known nucleic acids by the affinity matrix leaves a sample enriched
for those nucleic acid sequences that are unknown. In effect, the affinity matrix is
used to perform a subtractive hybridization to isolate unknown nucleic acid
sequences. The remaining "unknown" sequences can then be purified and
sequenced according to standard methods.

The affinity matrix can also be used to capture (isolate) and thereby purify unknown
nucleic acid sequences. For example, an affinity matrix can be prepared that
contains nucleic acids (affinity ligands) that are complementary to sequences not
previously identified, or not previously known to be expressed in a particular nucleic
acid sample. The sample is then hybridized to the affinity matrix and those
sequences that are retained on the affinity matrix are "unknown" nucleic acids. The
retained nucleic acids can be eluted from the matrix (e.g. at increased temperature,
increased destabilizing agent concentration, or decreased salt) and the nucleic acids
can then be sequenced according to standard methods.

Similarly, the affinity matrix can be used to efficiently capture (isolate) a number of
known nucleic acid sequences. Again, the matrix is prepared bearing nucleic acids
complementary to those nucleic acids it is desired to isolate. The sample is
contacted to the matrix under conditions where the complementary nucleic acid
sequences hybridize to the affinity ligands in the matrix. The non-hybridized material
is washed off the matrix leaving the desired sequences bound. The hybrid duplexes
are then denatured, providing a pool of the isolated nucleic acids. The different
nucleic acids in the pool can be subsequently separated according to standard
methods (e.g. gel electrophoresis).

As indicated above, the affinity matrices can be used to selectively remove nucleic
acids from virtually any sample containing nucleic acids (e.g., in a cDNA library,
DNA reverse transcribed from an mRNA, mRNA used directly or amplified, or
polymerized from a DNA template, and so forth). The nucleic acids adhering to the
column can be removed by washing with a low salt concentration buffer, a buffer
containing a destabilizing agent such as formamide, or by elevating the column
temperature.

In one particularly preferred embodiment, the affinity matrix can be used in a method
to enrich a sample for unknown RNA sequences (e.g. expressed sequence tags
(ESTs)). The method involves first providing an affinity matrix bearing a library of
oligonucleotide probes specific to known RNA (e.g., EST) sequences. Then, RNA
from undifferentiated and/or unactivated cells and RNA from differentiated or
activated or pathological (e.g., transformed) cells, or cells otherwise having a different
metabolic state, are separately hybridized against the affinity matrices to provide two
pools of RNAs lacking the known RNA sequences.

In a preferred embodiment, the affinity matrix is packed into a columnar casing. The 
sample is then applied to the affinity matrix (e.g. injected onto a column or applied to 
a column by a pump such as a sampling pump driven by an autosampler). The 
affinity matrix (e.g. affinity column) bearing the sample is subjected to conditions 
under which the nucleic acid probes comprising the affinity matrix hybridize 
specifically with complementary target nucleic acids. Such conditions are 
accomplished by maintaining appropriate pH, salt and temperature conditions to 
facilitate hybridization as discussed above. 

For a number of applications, it may be desirable to extract and separate messenger 
RNA from cells, cellular debris, and other contaminants. As such, the device of the 
present invention may, in some cases, include an mRNA purification chamber or 
channel. In general, such purification takes advantage of the poly-A tails on mRNA. 
In particular and as noted above, poly-T oligonucleotides may be immobilized 
within a chamber or channel of the device to serve as affinity ligands for mRNA. 
Poly-T oligonucleotides may be immobilized upon a solid support incorporated 
within the chamber or channel, or alternatively, may be immobilized upon the 
surface(s) of the chamber or channel itself. Immobilization of oligonucleotides on the 
surface of the chambers or channels may be carried out by methods described 
herein including, e.g., oxidation and silanation of the surface followed by standard 
DMT synthesis of the oligonucleotides. 

In operation, the lysed sample is introduced to a high salt solution to increase the 
ionic strength for hybridization, whereupon the mRNA will hybridize to the 
immobilized poly-T. The mRNA bound to the immobilized poly-T oligonucleotides is 
then washed free in a low ionic strength buffer. The poly-T oligonucleotides may be 
immobilized upon porous surfaces, e.g., porous silicon, zeolites, silica xerogels, 
sintered particles, or other solid supports. 

Hybridization 

Following sample preparation, the sample can be subjected to one or more different 
analysis operations. A variety of analysis operations may generally be performed, 
including size based analysis using, e.g., microcapillary electrophoresis, and/or 
sequence based analysis using, e.g., hybridization to an oligonucleotide array. 



In the latter case, the nucleic acid sample may be probed using an array of 
oligonucleotide probes. Oligonucleotide arrays generally include a substrate having 
a large number of positionally distinct oligonucleotide probes attached to the 
substrate. These arrays may be produced using mechanical or light directed 
synthesis methods which incorporate a combination of photolithographic methods 
and solid phase oligonucleotide synthesis methods. 

Light directed synthesis of oligonucleotide arrays 

The basic strategy for light directed synthesis of oligonucleotide arrays is as follows. 
The surface of a solid support, modified with photosensitive protecting groups, is 
illuminated through a photolithographic mask, yielding reactive hydroxyl groups in 
the illuminated regions. A selected nucleotide, typically in the form of a 3'-O- 
phosphoramidite-activated deoxynucleoside (protected at the 5' hydroxyl with a 
photosensitive protecting group), is then presented to the surface and coupling 
occurs at the sites that were exposed to light. Following capping and oxidation, the 
substrate is rinsed and the surface is illuminated through a second mask, to expose 
additional hydroxyl groups for coupling. A second selected nucleotide (e.g., 5'- 
protected, 3'-O-phosphoramidite-activated deoxynucleoside) is presented to the 
surface. The selective deprotection and coupling cycles are repeated until the 
desired set of products is obtained. Since photolithography is used, the process can 
be readily miniaturized to generate high density arrays of oligonucleotide probes. 
Furthermore, the sequence of the oligonucleotides at each site is known. See, 
Pease, et al. Mechanical synthesis methods are similar to the light directed methods 
except involving mechanical direction of fluids for deprotection and addition in the 
synthesis steps. 
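
The cycle of masked deprotection and coupling described above can be pictured with a small simulation. This is only an illustrative sketch: the function name synthesize_array, the site indices and the four-step mask schedule are hypothetical and are not taken from the cited work. Each step pairs a set of illuminated sites with the nucleotide presented after illumination, and sequences are written in order of addition.

```python
from typing import Iterable, List, Tuple


def synthesize_array(n_sites: int,
                     steps: Iterable[Tuple[Iterable[int], str]]) -> List[str]:
    """Return the probe grown at each site after all mask/coupling cycles."""
    probes = [""] * n_sites
    for illuminated_sites, nucleotide in steps:
        # Illumination through the mask deprotects only the exposed sites;
        # the presented nucleotide then couples at those sites alone.
        for site in illuminated_sites:
            probes[site] += nucleotide
    return probes


# Hypothetical schedule: four masks, one nucleotide presented per mask.
steps = [
    ({0, 1}, "A"),
    ({2, 3}, "C"),
    ({0, 2}, "G"),
    ({1, 3}, "T"),
]
print(synthesize_array(4, steps))  # ['AG', 'AT', 'CG', 'CT']
```

Because the mask used in every cycle is known, the sequence at every site is known, which is the property the paragraph above relies on.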

For some embodiments, oligonucleotide arrays may be prepared having all possible 
probes of a given length. The hybridization pattern of the target sequence on the 
array may be used to reconstruct the target DNA sequence. Hybridization analysis 
of large numbers of probes can be used to sequence long stretches of DNA or 
provide an oligonucleotide array which is specific and complementary to a particular 
nucleic acid sequence. For example, in particularly preferred aspects, the 
oligonucleotide array will contain oligonucleotide probes which are complementary 
to specific target sequences, and individual or multiple mutations of these. Such 
arrays are particularly useful in the diagnosis of specific disorders which are 
characterized by the presence of a particular nucleic acid sequence. 
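
As a rough illustration of an array carrying all possible probes of a given length, the sketch below enumerates the 4^n probes of length n and lists which of them are perfectly complementary to some stretch of a target. The helper names (all_probes, reverse_complement, hybridizing_probes) are hypothetical, and hybridization is idealized as exact Watson-Crick complementarity with no mismatches.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def all_probes(length):
    """Every DNA probe of the given length (4**length sequences)."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridizing_probes(target, length):
    """Probes that are perfectly complementary to a window of the target."""
    windows = (target[i:i + length] for i in range(len(target) - length + 1))
    return {reverse_complement(window) for window in windows}


print(len(all_probes(3)))                      # 64
print(sorted(hybridizing_probes("ACGTT", 3)))  # ['AAC', 'ACG', 'CGT']
```

The set of hybridizing probes is the idealized hybridization pattern from which a target sequence would be reconstructed.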

Following sample collection and nucleic acid extraction, the nucleic acid portion of 
the sample is typically subjected to one or more preparative reactions. These 
preparative reactions include in vitro transcription, labeling, fragmentation, 
amplification and other reactions. Nucleic acid amplification increases the number of 
copies of the target nucleic acid sequence of interest. A variety of amplification 
methods are suitable for use in the methods and device of the present invention, 
including for example, the polymerase chain reaction (PCR), the ligase 
chain reaction (LCR), self sustained sequence replication (3SR), and nucleic acid 
sequence based amplification (NASBA). 

The latter two amplification methods involve isothermal reactions based on 
isothermal transcription, which produce both single stranded RNA (ssRNA) and 
double stranded DNA (dsDNA) as the amplification products in a ratio of 
approximately 30 or 100 to 1, respectively. As a result, where these latter methods 
are employed, sequence analysis may be carried out using either type of substrate, 
i.e., complementary to either DNA or RNA. 

Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. 
One of skill in the art will appreciate that whatever amplification method is used, if a 
quantitative result is desired, care must be taken to use a method that maintains or 
controls for the relative frequencies of the amplified nucleic acids. 

PCR 

Methods of "quantitative" amplification are well known to those of skill in the art. For 
example, quantitative PCR involves simultaneously co-amplifying a known quantity 
of a control sequence using the same primers. This provides an internal standard 
that may be used to calibrate the PCR reaction. The high density array may then 
include probes specific to the internal standard for quantification of the amplified 
nucleic acid. 
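
How an internal standard calibrates the reaction can be sketched as a simple ratio. The sketch assumes, for illustration only, that target and control amplify with the same efficiency (they share primers) and that array signal is proportional to product amount; the function name and the numbers are hypothetical.

```python
def estimate_target_quantity(target_signal, control_signal, control_quantity):
    """Estimate the starting target quantity from an internal standard.

    Assumes equal amplification efficiency for target and control and a
    signal that is linear in the amount of amplified product.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return control_quantity * target_signal / control_signal


# Hypothetical readings: the control was spiked in at 5.0 units.
print(estimate_target_quantity(200.0, 100.0, 5.0))  # 10.0
```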

Thus, in one embodiment, this invention provides for a method of optimizing a probe 
set for detection of a particular gene. Generally, this method involves providing a 
high density array containing a multiplicity of probes of one or more particular 
length(s) that are complementary to subsequences of the mRNA transcribed by the 
target gene. In one embodiment the high density array may contain every probe of a 
particular length that is complementary to a particular mRNA. The probes of the high 
density array are then hybridized with their target nucleic acid alone and then 
hybridized with a high complexity, high concentration nucleic acid sample that does 
not contain the targets complementary to the probes. Thus, for example, where the 
target nucleic acid is an RNA, the probes are first hybridized with their target nucleic 
acid alone and then hybridized with RNA made from a cDNA library (e.g., reverse 
transcribed polyA+ mRNA) where the sense of the hybridized RNA is opposite 
that of the target nucleic acid (to ensure that the high complexity sample does not 
contain targets for the probes). Those probes that show a strong hybridization signal 
with their target and little or no cross-hybridization with the high complexity sample 
are preferred probes for use in the high density arrays of this invention. 
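
The selection rule in the preceding paragraph can be sketched as a filter over two sets of measurements. The thresholds min_signal and max_cross_ratio, the probe names and the intensities are hypothetical illustration values; the text itself does not fix numerical cut-offs.

```python
def select_probes(target_signal, background_signal,
                  min_signal=100.0, max_cross_ratio=0.1):
    """Keep probes with a strong target signal and little cross-hybridization.

    target_signal: probe -> intensity when hybridized with the target alone.
    background_signal: probe -> intensity with the high complexity sample.
    """
    selected = []
    for probe, signal in target_signal.items():
        cross = background_signal.get(probe, 0.0)
        if signal >= min_signal and cross <= max_cross_ratio * signal:
            selected.append(probe)
    return sorted(selected)


target = {"p1": 500.0, "p2": 50.0, "p3": 400.0}
background = {"p1": 10.0, "p2": 1.0, "p3": 200.0}
print(select_probes(target, background))  # ['p1']
```

Here p2 is rejected for a weak target signal and p3 for strong cross-hybridization.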

PCR amplification generally involves the use of one strand of the target nucleic acid 
sequence as a template for producing a large number of complements to that 
sequence. Generally, two primer sequences complementary to different ends of a 
segment of the complementary strands of the target sequence hybridize with their 
respective strands of the target sequence, and in the presence of polymerase 
enzymes and nucleoside triphosphates, the primers are extended along the target 
sequence. The extensions are melted from the target sequence and the process is 
repeated, this time with the additional copies of the target sequence synthesized in 
the preceding steps. PCR amplification typically involves repeated cycles of 
denaturation, hybridization and extension reactions to produce sufficient amounts of 
the target nucleic acid. The first step of each cycle of the PCR involves the 
separation of the nucleic acid duplex formed by the primer extension. Once the 
strands are separated, the next step in PCR involves hybridizing the separated 
strands with primers that flank the target sequence. The primers are then extended 
to form complementary copies of the target strands. For successful PCR 
amplification, the primers are designed so that the position at which each primer 
hybridizes along a duplex sequence is such that an extension product synthesized 
from one primer, when separated from the template (complement), serves as a 
template for the extension of the other primer. The cycle of denaturation, 
hybridization, and extension is repeated as many times as necessary to obtain the 
desired amount of amplified nucleic acid. 
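
Because each cycle uses the products of the preceding cycles as templates, the copy number grows geometrically. A minimal sketch of that arithmetic follows, assuming an idealized constant per-cycle efficiency (1.0 meaning perfect doubling); the function names and the efficiency parameter are illustrative assumptions, and real reactions plateau.

```python
import math


def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of cycles.

    efficiency is the fraction of templates copied per cycle (0 < e <= 1).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, desired_copies, efficiency=1.0):
    """Smallest whole number of cycles reaching the desired copy number."""
    if desired_copies <= initial_copies:
        return 0
    ratio = desired_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


print(copies_after(1, 10))     # 1024.0
print(cycles_needed(1, 1000))  # 10
```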

In PCR methods, strand separation is normally achieved by heating the reaction to a 
sufficiently high temperature for a sufficient time to cause the denaturation of the 
duplex but not to cause an irreversible denaturation of the polymerase. Typical heat 
denaturation involves temperatures ranging from about 80 °C to 105 °C 
for times ranging from seconds to minutes. Strand separation, however, can be 
accomplished by any suitable denaturing method including physical, chemical, or 
enzymatic means. Strand separation may be induced by a helicase, for example, or 
an enzyme capable of exhibiting helicase activity. 

In addition to PCR and IVT reactions, the methods and devices of the present 
invention are also applicable to a number of other reaction types, e.g., reverse 
transcription, nick translation, and the like. 

Labelling before hybridization 

The nucleic acids in a sample will generally be labeled to facilitate detection in 
subsequent steps. Labeling may be carried out during the amplification, in vitro 
transcription or nick translation processes. In particular, amplification, in vitro 
transcription or nick translation may incorporate a label into the amplified or 
transcribed sequence, either through the use of labeled primers or the incorporation 
of labeled dNTPs into the amplified sequence. 

Hybridization between the sample nucleic acid and the oligonucleotide probes upon 
the array is then detected, using, e.g., epifluorescence confocal microscopy. 
Typically, the sample is mixed during hybridization to enhance hybridization of nucleic 
acids in the sample to nucleic acid probes on the array. 

Labelling after hybridization 

In some cases, hybridized oligonucleotides may be labeled following hybridization. 
For example, where biotin labeled dNTPs are used in, e.g., amplification or 
transcription, streptavidin linked reporter groups may be used to label hybridized 
complexes. Such operations are readily integratable into the systems of the present 
invention. Alternatively, the nucleic acids in the sample may be labeled following 
amplification. Post amplification labeling typically involves the covalent attachment 
of a particular detectable group upon the amplified sequences. Suitable labels or 
detectable groups include a variety of fluorescent or radioactive labeling groups well 
known in the art. These labels may also be coupled to the sequences using 
methods that are well known in the art. 

Methods for detection depend upon the label selected. A fluorescent label is 
preferred because of its extreme sensitivity and simplicity. Standard labeling 
procedures are used to determine the positions where interactions between a 
sequence and a reagent take place. For example, if a target sequence is labeled 
and exposed to a matrix of different probes, only those locations where probes do 
interact with the target will exhibit any signal. Alternatively, other methods may be 
used to scan the matrix to determine where interaction takes place. Of course, the 
spectrum of interactions may be determined in a temporal manner by repeated 
scans of interactions which occur at each of a multiplicity of conditions. However, 
instead of testing each individual interaction separately, a multiplicity of sequence 
interactions may be simultaneously determined on a matrix. 

Means of detecting labeled target (sample) nucleic acids hybridized to the probes of 
the high density array are known to those of skill in the art. Thus, for example, where 
a colorimetric label is used, simple visualization of the label is sufficient. Where a 
radioactive labeled probe is used, detection of the radiation (e.g. with photographic 
film or a solid state detector) is sufficient. 

In a preferred embodiment, however, the target nucleic acids are labeled with a 
fluorescent label and the localization of the label on the probe array is accomplished 
with fluorescent microscopy. The hybridized array is excited with a light source at 
the excitation wavelength of the particular fluorescent label and the resulting 
fluorescence at the emission wavelength is detected. In a particularly preferred 
embodiment, the excitation light source is a laser appropriate for the excitation of the 
fluorescent label. 

The target polynucleotide may be labeled by any of a number of convenient 
detectable markers. A fluorescent label is preferred because it provides a very 
strong signal with low background. It is also optically detectable at high resolution 
and sensitivity through a quick scanning procedure. Other potential labeling moieties 
include radioisotopes, chemiluminescent compounds, labeled binding proteins, 
heavy metal atoms, spectroscopic markers, magnetic labels, and linked enzymes. 
Another method for labeling may bypass any label of the target sequence. The 
target may be exposed to the probes, and a double strand hybrid is formed at those 
positions only. Addition of a double strand specific reagent will detect where 
hybridization takes place. An intercalative dye such as ethidium bromide may be 
used as long as the probes themselves do not fold back on themselves to a 
significant extent forming hairpin loops. However, the length of the hairpin loops in 
short oligonucleotide probes would typically be insufficient to form a stable duplex. 

Suitable chromogens will include molecules and compounds which absorb light in a 
distinctive range of wavelengths so that a color may be observed, or emit light when 
irradiated with radiation of a particular wave length or wave length range, e.g., 
fluorescers. Biliproteins, e.g., phycoerythrin, may also serve as labels. 

A wide variety of suitable dyes are available, being primarily chosen to provide an 
intense color with minimal absorption by their surroundings. Illustrative dye types 
include quinoline dyes, triarylmethane dyes, acridine dyes, alizarine dyes, 
phthaleins, insect dyes, azo dyes, anthraquinoid dyes, cyanine dyes, 
phenazathionium dyes, and phenazoxonium dyes. 


A wide variety of fluorescers may be employed either by themselves or in 
conjunction with quencher molecules. Fluorescers of interest fall into a variety of 
categories having certain primary functionalities. These primary functionalities 
include 1- and 2-aminonaphthalene, p,p'-diaminostilbenes, pyrenes, quaternary 
phenanthridine salts, 9-aminoacridines, p,p'-diaminobenzophenone imines, 
anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene, bis- 
benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, bis-3- 
aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, 
benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, 
phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes and flavin. 

Individual fluorescent compounds which have functionalities for linking or which can 
be modified to incorporate such functionalities include, e.g., dansyl chloride; 
fluoresceins such as 3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate; 
N-phenyl 1-amino-8-sulfonatonaphthalene; N-phenyl 2-amino-6- 
sulfonatonaphthalene; 4-acetamido-4'-isothiocyanato-stilbene-2,2-disulfonic acid; 
pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate; N-phenyl, N-methyl 2- 
aminoaphthalene-6-sulfonate; ethidium bromide; stebrine; auromine-0,2-(9'- 
anthroyl)palmitate; dansyl phosphatidylethanolamine; N,N'-dioctadecyl 
oxacarbocyanine; N,N'-dihexyl oxacarbocyanine; merocyanine, 4- 
(3'pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate; 2- 
methylanthracene; 9-vinylanthracene; 2,2'-(vinylene-p-phenylene)bisbenzoxazole; 
p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; 
retinol; bis(3-aminopyridinium) 1,10-decandiyl diiodide; sulfonaphthylhydrazone of 
hellibrienin; chlorotetracycline; N-(7-dimethylamino-4-methyl-2-oxo-3- 
chromenyl)maleimide; N-[p-(2-benzimidazolyl)-phenyl]maleimide; N-(4- 
fluoranthyl)maleimide; bis(homovanillic acid); resazarin; 4-chloro-7-nitro-2,1,3- 
benzooxadiazole; merocyanine 540; resorufin; rose bengal; and 2,4-diphenyl-3(2H)- 
furanone. 

Desirably, fluorescers should absorb light above about 300 nm, preferably above about 
350 nm, and more preferably above about 400 nm, usually emitting at wavelengths 
greater than about 10 nm higher than the wavelength of the light absorbed. It should 
be noted that the absorption and emission characteristics of the bound dye may 
differ from the unbound dye. Therefore, when referring to the various wavelength 
ranges and characteristics of the dyes, it is intended to indicate the dyes as 
employed and not the dye which is unconjugated and characterized in an arbitrary 
solvent. 

Fluorescers are generally preferred because by irradiating a fluorescer with light, 
one can obtain a plurality of emissions. Thus, a single label can provide for a 
plurality of measurable events. 

Detectable signal may also be provided by chemiluminescent and bioluminescent 
sources. Chemiluminescent sources include a compound which becomes 
electronically excited by a chemical reaction and may then emit light which serves 
as the detectable signal or donates energy to a fluorescent acceptor. A diverse 
number of families of compounds have been found to provide chemiluminescence 
under a variety of conditions. One family of compounds is 2,3-dihydro-1,4- 
phthalazinedione. The most popular compound is luminol, which is the 5-amino 
compound. Other members of the family include the 5-amino-6,7,8-trimethoxy- and 
the dimethylamino[ca]benz analog. These compounds can be made to luminesce 
with alkaline hydrogen peroxide or calcium hypochlorite and base. Another family of 
compounds is the 2,4,5-triphenylimidazoles, with lophine as the common name for 
the parent product. Chemiluminescent analogs include para-dimethylamino and 
-methoxy substituents. Chemiluminescence may also be obtained with oxalates, 
usually oxalyl active esters, e.g., p-nitrophenyl and a peroxide, e.g., hydrogen 
peroxide, under basic conditions. Alternatively, luciferins may be used in conjunction 
with luciferase or lucigenins to provide bioluminescence. 

Spin labels are provided by reporter molecules with an unpaired electron spin which 
can be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin 
labels include organic free radicals, transition metal complexes, particularly 
vanadium, copper, iron, and manganese, and the like. Exemplary spin labels include 
nitroxide free radicals. 

Fragmentation 

In addition, amplified sequences may be subjected to other post amplification 
treatments. For example, in some cases, it may be desirable to fragment the 
sequence prior to hybridization with an oligonucleotide array, in order to provide 
segments which are more readily accessible to the probes, which avoid looping 
and/or hybridization to multiple probes. Fragmentation of the nucleic acids may 
generally be carried out by physical, chemical or enzymatic methods that are known 
in the art. 


Sample Analysis 

Following the various sample preparation operations, the sample will generally be 
subjected to one or more analysis operations. Particularly preferred analysis 
operations include, e.g., sequence based analyses using an oligonucleotide array 
and/or size based analyses using, e.g., microcapillary array electrophoresis. 

Capillary Electrophoresis 

In some embodiments, it may be desirable to provide an additional, or alternative 
means for analyzing the nucleic acids from the sample. 

Microcapillary array electrophoresis generally involves the use of a thin capillary or 
channel which may or may not be filled with a particular separation medium. 
Electrophoresis of a sample through the capillary provides a size based separation 
profile for the sample. Microcapillary array electrophoresis generally provides a rapid 
method for size based sequencing, PCR product analysis and restriction fragment 
sizing. The high surface to volume ratio of these capillaries allows for the application 
of higher electric fields across the capillary without substantial thermal variation 
across the capillary, consequently allowing for more rapid separations. Furthermore, 
when combined with confocal imaging methods, these methods provide sensitivity in 
the range of attomoles, which is comparable to the sensitivity of radioactive 
sequencing methods. 

In many capillary electrophoresis methods, the capillaries, e.g., fused silica 
capillaries or channels etched, machined or molded into planar substrates, are filled 
with an appropriate separation/sieving matrix. Typically, a variety of sieving matrices 
known in the art may be used in the microcapillary arrays. Examples of such 
matrices include, e.g., hydroxyethyl cellulose, polyacrylamide, agarose and the like. 
Gel matrices may be introduced and polymerized within the capillary channel. 
However, in some cases, this may result in entrapment of bubbles within the 
channels which can interfere with sample separations. Accordingly, it is often 
desirable to place a preformed separation matrix within the capillary channel(s), 
prior to mating the planar elements of the capillary portion. Fixing the two parts, e.g., 
through sonic welding, permanently fixes the matrix within the channel. 
Polymerization outside of the channels helps to ensure that no bubbles are formed. 
Further, the pressure of the welding process helps to ensure a void-free system. 

In addition to its use in nucleic acid "fingerprinting" and other size based analyses, 
the capillary arrays may also be used in sequencing applications. In particular, gel 
based sequencing techniques may be readily adapted for capillary array 
electrophoresis. 



Expression products 


In addition to detection of mRNA, or as the sole detection method, expression 
products from the genes discussed above may be detected as indications of the 
biological condition of the tissue. Expression products may be detected in either the 
tissue sample as such, or in a body fluid sample, such as blood, serum, plasma, 
faeces, mucus, sputum, cerebrospinal fluid, and/or urine of the individual. 

The expression products, peptides and proteins, may be detected by any suitable 
technique known to the person skilled in the art. 

In a preferred embodiment the expression products are detected by means of 
specific antibodies directed to the various expression products, such as 
immunofluorescent and/or immunohistochemical staining of the tissue. 

Immunohistochemical localization of expressed proteins may be carried out by 
immunostaining of tissue sections from the single tumors to determine which cells 
expressed the protein encoded by the transcript in question. The transcript levels 
were used to select a group of proteins supposed to show variation from sample to 
sample, making possible a rough correlation between level of protein detected and 
intensity of the transcript on the microarray. 


For example, sections were cut from paraffin-embedded tissue blocks, mounted, and 
deparaffinized by incubation at 80 °C for 10 min, followed by immersion in heated oil 
at 60 °C for 10 min (Estisol 312, Estichem A/S, Denmark) and rehydration. Antigen 
retrieval is achieved in TEG (TrisEDTA-Glycerol) buffer using microwaves at 900 W. 
The tissue sections are cooled in the buffer for 15 min before a brief rinse in tap water. 
Endogenous peroxidase activity is blocked by incubating the sections with 1% H2O2 
for 20 min, followed by three rinses in tap water, 1 min each. The sections are then 
soaked in PBS buffer for 2 min. The next steps are modified from the descriptions 
given by Oncogene Science Inc., in the Mouse Immunohistochemistry Detection 
System, XHC01 (UniTect, Uniondale, NY, USA). Briefly, the tissue sections are 
incubated overnight at 4 °C with primary antibody (against beta-2 microglobulin 
(Dako), cytokeratin 8, cystatin-C (both from Europa, US), junB, CD59, E-cadherin, 
apo-E, cathepsin E, vimentin, IGFII (all from Santa Cruz)), followed by three rinses in 
PBS buffer for 5 min each. Afterwards, the sections are incubated with biotinylated 
secondary antibody for 30 min, rinsed three times with PBS buffer and subsequently 
incubated with ABC (avidin-biotinylated horseradish peroxidase complex) for 30 
min, followed by three rinses in PBS buffer. 

Staining is performed by incubation with AEC (3-amino-ethylcarbazole) for 10 min. 
The tissue sections are counter stained with Mayers hematoxylin, washed in tap 
water for 5 min, and mounted with glycerol-gelatin. Positive and negative controls 
may be included in each staining round with all antibodies. 

In yet another embodiment the expression products may be detected by means of 
conventional enzyme assays, such as ELISA methods. 

Furthermore, the expression products may be detected by means of peptide/protein 
chips capable of specifically binding the peptides and/or proteins assessed. Thereby 
an expression pattern may be obtained. 


Assay 

Thus, in a further aspect the invention relates to an assay for determining an 
expression pattern of a colon and/or rectum cell, comprising at least a first marker 
and/or a second marker, wherein the first marker is capable of detecting a gene 
from a first gene group as defined above, and the second marker is capable of 
detecting a gene from a second gene group as defined above. 

In a preferred embodiment the assay comprises at least two markers for each gene 
group. 

correlating the first expression level and the second expression level to a standard 
level of the assessed genes to determine the presence or absence of a biological 
condition in the animal tissue. 

The marker(s) preferably specifically detect a gene as identified herein, in 
particular the genes of the tables in the examples and as discussed above. 

As discussed above the marker may be any nucleotide probe, such as a DNA, RNA, 
PNA, or LNA probe capable of hybridising to mRNA indicative of the expression 
level. The hybridisation conditions are preferably as described below for probes. 

In another embodiment the marker is an antibody capable of specifically binding the 
expression product in question. 


Detection 

Patterns can be compared manually by a person or by a computer or other machine. 
An algorithm can be used to detect similarities and differences. The algorithm may 
score and compare, for example, the genes which are expressed and the genes 
which are not expressed. Alternatively, the algorithm may look for changes in 
intensity of expression of a particular gene and score changes in intensity between 
two samples. Similarities may be determined on the basis of genes which are 
expressed in both samples and genes which are not expressed in both samples or 
on the basis of genes whose intensity of expression is numerically similar. 
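
The two kinds of comparison described above can be sketched as follows. This is an illustrative sketch only: the function names, the expression threshold, the two-fold cut-off and the intensity floor are hypothetical choices, not values given in this description.

```python
import math


def presence_similarity(sample_a, sample_b, threshold):
    """Fraction of shared genes that are expressed in both samples or in
    neither, calling a gene expressed when its intensity >= threshold."""
    genes = sample_a.keys() & sample_b.keys()
    if not genes:
        raise ValueError("samples share no genes")
    agree = sum((sample_a[g] >= threshold) == (sample_b[g] >= threshold)
                for g in genes)
    return agree / len(genes)


def intensity_changes(sample_a, sample_b, min_fold=2.0, floor=1.0):
    """log2(b/a) for genes changing at least min_fold between samples.

    Intensities below floor are raised to floor to avoid division by zero.
    """
    changes = {}
    for gene in sample_a.keys() & sample_b.keys():
        a = max(sample_a[gene], floor)
        b = max(sample_b[gene], floor)
        log_ratio = math.log2(b / a)
        if abs(log_ratio) >= math.log2(min_fold):
            changes[gene] = log_ratio
    return changes


normal = {"g1": 100.0, "g2": 5.0, "g3": 50.0}
tumor = {"g1": 400.0, "g2": 4.0, "g3": 5.0}
print(presence_similarity(normal, tumor, threshold=10.0))
print(sorted(intensity_changes(normal, tumor)))  # ['g1', 'g3']
```

In this hypothetical data g1 and g2 agree on presence or absence while g3 does not, and g1 and g3 change more than two-fold in opposite directions.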

Generally, the detection operation will be performed using a reader device external 
to the diagnostic device. However, it may be desirable in some cases, to incorporate 
the data gathering operation into the diagnostic device itself. 


The detection apparatus may be a fluorescence detector, or a spectroscopic 
detector, or another detector. 

Although hybridization is one type of specific interaction which is clearly useful for 
use in this mapping embodiment, antibody reagents may also be very useful. 

Data Gathering and Analysis 

Gathering data from the various analysis operations, e.g., oligonucleotide and/or 
microcapillary arrays, will typically be carried out using methods known in the art. 



For example, the arrays may be scanned using lasers to excite fluorescently labeled 
targets that have hybridized to regions of probe arrays mentioned above, which can 
then be imaged using charge coupled devices ("CCDs") for a wide field scanning of 
the array. Alternatively, another particularly useful method for gathering data from 
the arrays is through the use of laser confocal microscopy which combines the ease 
and speed of a readily automated process with high resolution detection. 

Following the data gathering operation, the data will typically be reported to a data 
analysis operation. To facilitate the sample analysis operation, the data obtained by 
the reader from the device will typically be analyzed using a digital computer. 
Typically, the computer will be appropriately programmed for receipt and storage of 
the data from the device, as well as for analysis and reporting of the data gathered, 
i.e., interpreting fluorescence data to determine the sequence of hybridizing probes, 
normalization of background and single base mismatch hybridizations, ordering of 
sequence data in SBH applications, and the like. 

It is an object of the present invention to provide a biological sample which may be
classified or characterized by analyzing the pattern of specific interactions
mentioned above. This may be applicable to a cell or tissue type, to the messenger
RNA population expressed by a cell, to the genetic content of a cell, or to virtually
any sample which can be classified and/or identified by its combination of specific
molecular properties.

Pharmaceutical composition 

The invention also relates to a pharmaceutical composition for treating the biological
condition, such as colorectal tumors.

In one embodiment the pharmaceutical composition comprises one or more of the
peptides being expression products as defined above. In a preferred embodiment,
the peptides are bound to carriers. The peptides may suitably be coupled to a polymer
carrier, for example a protein carrier, such as BSA. Such formulations are well
known to the person skilled in the art.

The peptides may be suppressor peptides normally lost or decreased in tumor tissue,
administered in order to stabilise tumors towards a less malignant stage. In another
embodiment the peptides are onco-peptides capable of eliciting an immune
response towards the tumor cells.

In another embodiment the pharmaceutical composition comprises genetic material,
either genetic material for substitution therapy, or for suppressing therapy as
discussed below.



In a third embodiment the pharmaceutical composition comprises at least one
antibody produced as described above.

In the present context the term pharmaceutical composition is used synonymously
with the term medicament. The medicament of the invention comprises an effective
amount of one or more of the compounds as defined above, or a composition as
defined above, in combination with pharmaceutically acceptable additives. Such a
medicament may suitably be formulated for oral, percutaneous, intramuscular,
intravenous, intracranial, intrathecal, intracerebroventricular, intranasal or pulmonary
administration. For most indications a localised or substantially localised application is
preferred.

Strategies in formulation development of medicaments and compositions based on
the compounds of the present invention generally correspond to formulation
strategies for any other protein-based drug product. Potential problems and the guidance
required to overcome these problems are dealt with in several textbooks, e.g.
"Therapeutic Peptides and Protein Formulation, Processing and Delivery Systems",
Ed. A.K. Banga, Technomic Publishing AG, Basel, 1995.

Injectables are usually prepared either as liquid solutions or suspensions, or as solid
forms suitable for solution in, or suspension in, liquid prior to injection. The
preparation may also be emulsified. The active ingredient is often mixed with excipients
which are pharmaceutically acceptable and compatible with the active ingredient.
Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol or the
like, and combinations thereof. In addition, if desired, the preparation may contain
minor amounts of auxiliary substances such as wetting or emulsifying agents, pH
buffering agents, or agents which enhance the effectiveness or transportation of the
preparation.

Formulations of the compounds of the invention can be prepared by techniques
known to the person skilled in the art. The formulations may contain pharmaceutically
acceptable carriers and excipients including microspheres, liposomes,
microcapsules, nanoparticles or the like.

The preparation may suitably be administered by injection, optionally at the site
where the active ingredient is to exert its effect. Additional formulations which are
suitable for other modes of administration include suppositories and, in some
cases, oral formulations. For suppositories, traditional binders and carriers include
polyalkylene glycols or triglycerides. Such suppositories may be formed from mixtures
containing the active ingredient(s) in the range of from 0.5% to 10%, preferably
1-2%. Oral formulations include such normally employed excipients as, for example,
pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium
saccharine, cellulose, magnesium carbonate, and the like. These compositions take
the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders and generally contain 10-95% of the active ingredient(s),
preferably 25-70%.

The preparations are administered in a manner compatible with the dosage formulation,
and in such amount as will be therapeutically effective. The quantity to be
administered depends on the subject to be treated, including, e.g. the weight and age
of the subject, the disease to be treated and the stage of disease. Suitable dosage
ranges are of the order of several hundred µg active ingredient per administration
with a preferred range of from about 0.1 µg to 1000 µg, such as in the range of from
about 1 µg to 300 µg, and especially in the range of from about 10 µg to 50 µg.
Administration may be performed once or may be followed by subsequent administrations.
The dosage will also depend on the route of administration and will vary with
the age and weight of the subject to be treated. A preferred dose would be in the
interval 30 mg to 70 mg per 70 kg body weight.

Some of the compounds of the present invention are sufficiently active, but for some
of the others, the effect will be enhanced if the preparation further comprises
pharmaceutically acceptable additives and/or carriers. Such additives and carriers will be
known in the art. In some cases, it will be advantageous to include a compound
which promotes delivery of the active substance to its target.

In many instances, it will be necessary to administer the formulation multiple
times. Administration may be a continuous infusion, such as intraventricular infusion,
or administration in several doses, such as several times a day, daily, several times a
week, weekly, etc.

Vaccines 

In a further embodiment the present invention relates to a vaccine for the
prophylaxis or treatment of a biological condition comprising at least one expression
product from at least one gene, said gene being expressed as defined above.

The term vaccines is used with its normal meaning, i.e. preparations of immunogenic
material for administration to induce in the recipient an immunity to infection or
intoxication by a given infecting agent. Vaccines may be administered by intravenous
injection or through oral, nasal and/or mucosal administration. Vaccines may be
either simple vaccines prepared from one species of expression products, such as
proteins or peptides, or a variety of expression products, or they may be mixed
vaccines containing two or more simple vaccines. They are prepared in such a manner
as not to destroy the immunogenic material, although the methods of preparation
vary, depending on the vaccine.

The enhanced immune response achieved according to the invention can be attributable
to, e.g., an enhanced increase in the level of immunoglobulins or in the level of
T-cells, including cytotoxic T-cells, and will result in immunisation of at least 50% of
individuals exposed to said immunogenic composition or vaccine, such as at least 55%,
for example at least 60%, such as at least 65%, for example at least 70%, for example
at least 75%, such as at least 80%, for example at least 85%, such as at least
90%, for example at least 92%, such as at least 94%, for example at least 96%,
such as at least 97%, for example at least 98%, such as at least 98.5%, for example
at least 99%, for example at least 99.5% of the individuals exposed to said
immunogenic composition or vaccine.

Compositions according to the invention may also comprise any carrier and/or adjuvant
known in the art including functional equivalents thereof. Functionally equivalent
carriers are capable of presenting the same immunogenic determinant in essentially
the same steric conformation when used under similar conditions. Functionally
equivalent adjuvants are capable of providing similar increases in the efficacy
of the composition when used under similar conditions.

Therapy 

The invention further relates to a method of treating individuals suffering from the
biological condition in question, in particular for treating a colorectal tumor. 

In one embodiment the invention relates to a method of substitution therapy, i.e.
administration of genetic material generally expressed in normal cells, but lost or
decreased in biological condition cells (tumor suppressors). Thus, the invention
relates to a method for reducing cell tumorigenicity of a cell, said method comprising

obtaining at least one gene selected from genes being expressed in an amount two-fold
higher in normal cells than the amount expressed in said tumor cell (tumor
suppressors), and

introducing said at least one gene into the tumor cell in a manner allowing 
expression of said gene(s). 
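
The two-fold selection criterion stated above can be illustrated with a minimal sketch. It assumes that expression levels are available as one intensity value per gene for a normal sample and for a tumor sample; the function name, the gene names and the values are hypothetical and are not taken from the data of the invention.

```python
def select_suppressor_candidates(normal, tumor, min_fold=2.0):
    """Return genes whose normal-tissue level is at least min_fold times the tumor level.

    normal, tumor: dicts mapping a gene identifier to an expression intensity.
    Genes lacking a positive tumor measurement are skipped, because a fold
    change cannot be computed for them in this simple sketch.
    """
    selected = []
    for gene, normal_level in normal.items():
        tumor_level = tumor.get(gene)
        if tumor_level is None or tumor_level <= 0:
            continue
        if normal_level / tumor_level >= min_fold:
            selected.append(gene)
    return sorted(selected)


# Hypothetical intensities, for illustration only.
normal = {"geneA": 400.0, "geneB": 150.0, "geneC": 90.0}
tumor = {"geneA": 100.0, "geneB": 140.0, "geneC": 30.0}

candidates = select_suppressor_candidates(normal, tumor)
print(candidates)
```

In practice a gene that is entirely absent from the tumor sample may be of particular interest; how such cases are handled is a design choice that this sketch leaves open.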

The at least one gene is preferably selected individually from genes comprising a
sequence as identified below

qy to mouse Pcbp1 - poly(rC)-binding protein 1

AA319615_at    secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15


and from

Human chromogranin A mRNA, complete cds    J03915
Human adipsin/complement factor D mRNA, complete cds    M84526
Homo sapiens MLC-1V/Sb isoform gene    M24248
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds    M22324
H.sapiens MT-11 mRNA    X76717
H.sapiens GCAP-II gene    Z70295
Human somatostatin I gene and flanks    J00306
Human YMP mRNA, complete cds    U52101
H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel    X87159
Human K12 protein precursor mRNA, complete cds    U77643
Human sulfate transporter (DTD) mRNA, complete cds    U14528
Human transcription factor hGATA-6 mRNA, complete cds    U66075
H.sapiens SCAD gene, exon 1 and joining features    Z80345
Human S-lac lectin L-14-II (LGALS2) gene    M87860
Human mRNA for protein tyrosine phosphatase    D15049
H.sapiens mRNA for tetranectin    X64559
Human 11 kd protein mRNA, complete cds    U28249
Human anti-mullerian hormone type II receptor precursor gene, complete cds    U29700
Human heparin binding protein (HBp17) mRNA, complete cds    M60047
Human ADP-ribosylation factor (hARFS) mRNA, complete cds    M57763
beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein, beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein {alternatively spliced, exon 10 to 13 region} [human, Genomic, 1851 nt, segment 3 of 3]    S81083
Zinc Finger Protein Znf155    HG4243-HT4513
Human glucagon mRNA, complete cds    J04040
H.sapiens mRNA for hair keratin, hhbS    X99140
Human tubulin-folding cofactor E mRNA, complete cds    U61232
Human integrin alpha-3 chain mRNA, complete cds    M59911
Human NACP gene    U46901
H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5)    Z47553
Human mRNA for ATF-a transcription factor    X52943
H.sapiens intestinal VIP receptor related protein mRNA    X77777

In a preferred embodiment at least two different genes are introduced into the tumor
cell.

In another aspect the invention relates to a therapy whereby genes generally 
correlated to disease are inhibited by one or more of the following methods: 

A method for reducing cell tumorigenicity of a cell, said method comprising 

obtaining at least one nucleotide probe capable of hybridising with at least one gene 
of a tumor cell, said at least one gene being selected from genes being expressed in 
an amount at least one-fold lower in normal cells than the amount expressed in said 
tumor cell, and 

introducing said at least one nucleotide probe into the tumor cell in a manner
allowing the probe to hybridise to the at least one gene, thereby inhibiting
expression of said at least one gene. This method is preferably based on anti-sense
technology, whereby the hybridisation of said probe to the gene leads to a
down-regulation of said gene.

The down-regulation may of course also be based on a probe capable of hybridising 
to regulatory components of the genes in question, such as promoters. 

The probes are preferably selected from probes capable of hybridising to a
nucleotide sequence comprising a sequence as identified below 



RC_AA609013_s_at    APPPP    microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at      APPPP    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at      APPPP    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at      APPPP    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at      APPPP    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at      APPPP    chrom 13 no homology
RC_N33920_at        APPPP    ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at        APPPP    KIAA1199 protein, chrom 15
RC_R67275_s_at      APPPP    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at        APPPP    hypothetical protein; chrom 17
                    APPPP    chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at    APPPP    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at      APPPP    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at      APPPP    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at        APPPP    chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at      APPPP    no homology
RC_AA417p78_at      APPPP    chrom 7q31; AF017104 clone
M29873_s_at         APPPP    cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at      AAPPP
RC_T92363_s_at      AAPPP
RC_N89910_at        AAAPP
RC_W60516_at        AAAPP
RC_AA219699_at      AAAPP
RC_AA449450_at      AAAPP

Or from 



Homo sapiens (clones MDP4, MDP7) microsomal dipeptidase (MDP) mRNA, complete cds    J05257
Homo sapiens reg gene homologue, complete cds    L08010
H.sapiens mRNA for prepro-alpha2(I) collagen    Z74616
Human S-adenosylhomocysteine hydrolase (AHCY) mRNA, complete cds    M61832
Transcription Factor IIIa    HG4312-HT4582
Human gene for melanoma growth stimulatory activity (MGSA)    X54489
Human stromelysin-3 mRNA    X57766
CDC25Hu2=cdc25+ homolog [human, mRNA, 3118 nt]    S78187
Human mRNA for cripto protein    X14253
Human transformation-sensitive protein (IEF SSP 3521) mRNA, complete cds    M86752
Human complement component 2 (C2) gene allele b    L09708
H.sapiens mRNA for ITBA2 protein    X92896
H.sapiens encoding CLA-1 mRNA    Z22555
Human fibroblast growth factor receptor 4 (FGFR4) mRNA, complete cds    L03840
Fibronectin, Alt. Splice 1    HG3044-HT3742
tyk2    X54667
Human mRNA for B-myb gene    X13293
Human phosphofructokinase (PFKM) mRNA, complete cds    U24183
Human pre-B cell enhancing factor (PBEF) mRNA, complete cds    U02020
Human SH2-containing inositol 5-phosphatase (hSHIP) mRNA, complete cds    U57650
Human interleukin 8 (IL8) gene, complete cds    M28130
Human lamin B receptor (LBR) mRNA, complete cds    L25931
H.sapiens mRNA for protein tyrosine phosphatase    Z48541
Human mRNA for unc-18 homologue, complete cds    D63851
H.sapiens mRNA for Zn-alpha2-glycoprotein    X59766
    Z25521
Human asparagine synthetase mRNA, complete cds    M27396
Human hepatitis delta antigen interacting protein A (dipA) mRNA, complete cds    U63825
Human spliceosomal protein (SAP 61) mRNA, complete cds    U08815
Human protein kinase C-binding protein RACK7 mRNA, partial cds    U48251
Human MAC30 mRNA, 3' end    L19183
Human thrombospondin 2 (THBS2) mRNA, complete cds    L12350
Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds    U08021
H.sapiens mRNA for type I interstitial collagenase    X54925
Human cytochrome b561 gene    U29463
Human H19 RNA gene, complete cds (spliced in silico)    M32053
Human collagen type XVIII alpha 1 (COL18A1) mRNA, partial cds    L22548
Human clone 23733 mRNA, complete cds    U79274

In another embodiment the probes consist of the sequences identified above.

The hybridization may be tested in vitro at conditions corresponding to in vivo
conditions. Typically, hybridization conditions are of low to moderate stringency.
These conditions favour specific interactions between completely complementary
sequences, but allow some non-specific interaction between less than perfectly
matched sequences to occur as well. After hybridization, the nucleic acids can be
"washed" under moderate or high conditions of stringency to dissociate duplexes
that are bound together by some non-specific interaction (the nucleic acids that form
these duplexes are thus not completely complementary).

As is known in the art, the optimal conditions for washing are determined empirically,
often by gradually increasing the stringency. The parameters that can be
changed to affect stringency include, primarily, temperature and salt concentration.
In general, the lower the salt concentration and the higher the temperature, the
higher the stringency. Washing can be initiated at a low temperature (for example,
room temperature) using a solution containing a salt concentration that is equivalent
to or lower than that of the hybridization solution. Subsequent washing can be carried
out using progressively warmer solutions having the same salt concentration.
As alternatives, the salt concentration can be lowered and the temperature maintained
in the washing step, or the salt concentration can be lowered and the temperature
increased. Additional parameters can also be altered. For example, use of
a destabilizing agent, such as formamide, alters the stringency conditions.

In reactions where nucleic acids are hybridized, the conditions used to achieve a
given level of stringency will vary. There is not one set of conditions, for example,
that will allow duplexes to form between all nucleic acids that are 85% identical to
one another; hybridization also depends on unique features of each nucleic acid.
The length of the sequence, the composition of the sequence (for example, the
content of purine-like nucleotides versus the content of pyrimidine-like nucleotides)
and the type of nucleic acid (for example, DNA or RNA) affect hybridization. An
additional consideration is whether one of the nucleic acids is immobilized (for
example, on a filter).

An example of a progression from lower to higher stringency conditions is the
following, where the salt content is given as the relative abundance of SSC (a salt
solution containing sodium chloride and sodium citrate; 2X SSC is 10-fold more
concentrated than 0.2X SSC). Nucleic acids are hybridized at 42°C in 2X SSC/0.1%
SDS (sodium dodecylsulfate; a detergent) and then washed in 0.2X SSC/0.1% SDS
at room temperature (for conditions of low stringency); 0.2X SSC/0.1% SDS at 42°C
(for conditions of moderate stringency); and 0.1X SSC at 68°C (for conditions of
high stringency). Washing can be carried out using only one of the conditions given,
or each of the conditions can be used (for example, washing for 10-15 minutes each
in the order listed above). Any or all of the washes can be repeated. As mentioned
above, optimal conditions will vary and can be determined empirically.
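
The wash progression described above can be written down as a small sketch that checks that each successive wash is at least as stringent as the previous one (salt not higher, temperature not lower). The sketch relies on the conventional definition of 1X SSC as 150 mM sodium chloride and 15 mM sodium citrate; the value of 22°C for room temperature and all function names are assumptions made for illustration.

```python
# Conventional composition: 1X SSC = 150 mM NaCl (plus 15 mM sodium citrate).
NACL_MM_PER_X = 150.0

# (label, SSC strength in "X", temperature in deg C); 22 deg C stands in for room temperature.
WASHES = [
    ("low", 0.2, 22.0),
    ("moderate", 0.2, 42.0),
    ("high", 0.1, 68.0),
]


def nacl_mm(ssc_x):
    """NaCl concentration in mM for a given SSC strength."""
    return ssc_x * NACL_MM_PER_X


def stringency_never_decreases(washes):
    """True if every wash has salt <= and temperature >= the wash before it."""
    for (_, salt_a, temp_a), (_, salt_b, temp_b) in zip(washes, washes[1:]):
        if salt_b > salt_a or temp_b < temp_a:
            return False
    return True


for label, ssc_x, temp_c in WASHES:
    print(f"{label}: {ssc_x}X SSC ({nacl_mm(ssc_x):.0f} mM NaCl) at {temp_c:.0f} C")
print(stringency_never_decreases(WASHES))
```

The check only orders washes by salt and temperature; other parameters mentioned above, such as formamide, are not modelled.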

In another aspect a method of reducing tumorigenicity relates to the use of
antibodies against an expression product of a cell from the biological tissue. The
antibodies may be produced by any suitable method, such as a method comprising
the steps of

obtaining expression product(s) from at least one gene, said gene being expressed
as defined above for oncogenes,

immunising a mammal with said expression product(s), and obtaining antibodies against
the expression product.

Use 

The methods described above may be used for producing an assay for diagnosing a
biological condition in animal tissue, or for identification of the origin of a piece of 
tissue. 

Furthermore, the invention relates to the use of a peptide as defined above for
preparation of a pharmaceutical composition for the treatment of a biological 
condition in animal tissue. 



Furthermore, the invention relates to the use of a gene as defined above for 
preparation of a pharmaceutical composition for the treatment of a biological 
condition in animal tissue. 

Also, the invention relates to the use of a probe as defined above for preparation of
a pharmaceutical composition for the treatment of a biological condition in animal 
tissue. 

Gene delivery therapy 

The genetic material discussed above may be any of the described genes or
functional parts thereof. The constructs may be introduced as a single DNA molecule
encoding all of the genes, or as different DNA molecules having one or more
genes. The constructs may be introduced simultaneously or consecutively, each
with the same or different markers.

The gene may be linked to the complex as such or protected by any suitable system
normally used for transfection such as viral vectors or artificial viral envelope,
liposomes or micelles, wherein the system is linked to the complex.

Numerous techniques for introducing DNA into eukaryotic cells are known to the
skilled artisan. Often this is done by means of vectors, and often in the form of
nucleic acid encapsidated by a (frequently virus-like) proteinaceous coat. Gene
delivery systems may be applied to a wide range of clinical as well as experimental
applications.

Vectors containing useful elements such as selectable and/or amplifiable markers,
promoter/enhancer elements for expression in mammalian, particularly human,
cells, and which may be used to prepare stocks of construct DNAs and for carrying
out transfections are well known in the art. Many are commercially available.

Various techniques have been developed for modification of target tissue and cells
in vivo. A number of virus vectors, discussed below, are known which allow
transfection and random integration of the virus into the host. See, for example,
Dubensky et al. (1984) Proc. Natl. Acad. Sci. USA 81:7529-7533; Kaneda et al. (1989)
Science 243:375-378; Hiebert et al. (1989) Proc. Natl. Acad. Sci. USA 88:3594-3598;
Hatzoglu et al. (1990) J. Biol. Chem. 265:17285-17293; Ferry et al. (1991)
Proc. Natl. Acad. Sci. USA 88:8377-8381. Routes and modes of administering the
vector include injection, e.g. intravascularly or intramuscularly, inhalation, or other
parenteral administration.

Advantages of adenovirus vectors for human gene therapy include the fact that
recombination is rare, no human malignancies are known to be associated with such
viruses, the adenovirus genome is double stranded DNA which can be manipulated
to accept foreign genes of up to 7.5 kb in size, and live adenovirus is a safe human
vaccine organism.

Another vector which can express the DNA molecule of the present invention, and
is useful in gene therapy, particularly in humans, is vaccinia virus, which can be
rendered non-replicating (U.S. Pat. Nos. 5,225,336; 5,204,243; 5,155,020; 4,769,330).

Based on the concept of viral mimicry, artificial viral envelopes (AVE) are designed
based on the structure and composition of a viral membrane, such as HIV-1 or RSV,
and used to deliver genes into cells in vitro and in vivo. See, for example, U.S. Pat.
No. 5,252,348; Schreier H. et al., J. Mol. Recognit., 1995, 8:59-62; Schreier H. et al.,
J. Biol. Chem., 1994, 269:9090-9098; Schreier, H., Pharm. Acta Helv., 1994, 68:145-159;
Chander, R. et al., Life Sci., 1992, 50:481-489, which references are hereby
incorporated by reference in their entirety. The envelope is preferably produced in a
two-step dialysis procedure where the "naked" envelope is formed initially, followed
by unidirectional insertion of the viral surface glycoprotein of interest. This process
and the physical characteristics of the resulting AVE are described in detail by
Chander et al. (supra). Examples of AVE systems are (a) an AVE containing the
HIV-1 surface glycoprotein gp160 (Chander et al., supra; Schreier et al., 1995, supra)
or glycosyl phosphatidylinositol (GPI)-linked gp120 (Schreier et al., 1994, supra),
respectively, and (b) an AVE containing the respiratory syncytial virus (RSV)
attachment (G) and fusion (F) glycoproteins (Stecenko, A. A. et al., Pharm. Pharmacol.
Lett. 1:127-129 (1992)). Thus, vesicles are constructed which mimic the natural
membranes of enveloped viruses in their ability to bind to and deliver materials to
cells bearing corresponding surface receptors.

AVEs are used to deliver genes both by intravenous injection and by instillation in
the lungs. For example, AVEs are manufactured to mimic RSV, exhibiting the RSV F
surface glycoprotein which provides selective entry into epithelial cells. F-AVE are
loaded with a plasmid coding for the gene of interest (or a reporter gene such as
CAT not present in mammalian tissue).

The AVE system described herein is physically and chemically essentially identical
to the natural virus yet is entirely "artificial", as it is constructed from phospholipids,
cholesterol, and recombinant viral surface glycoproteins. Hence, there is no carry-over
of viral genetic information and no danger of inadvertent viral infection.
Construction of the AVEs in two independent steps allows for bulk production of the
plain lipid envelopes which, in a separate second step, can then be marked with the
desired viral glycoprotein, also allowing for the preparation of protein cocktail
formulations if desired.

Another delivery vehicle for use in the present invention is based on the recent
description of attenuated Shigella as a DNA delivery system (Sizemore, D. R. et al.,
Science 270:299-302 (1995), which reference is incorporated by reference in its
entirety). This approach exploits the ability of Shigellae to enter epithelial cells and
escape the phagocytic vacuole as a method for delivering the gene construct into
the cytoplasm of the target cell. Invasion with as few as one to five bacteria can
result in expression of the foreign plasmid DNA delivered by these bacteria.

A preferred type of mediator of nonviral transfection in vitro and in vivo is cationic
(ammonium derivatized) lipids. These positively charged lipids form complexes with
negatively charged DNA, resulting in DNA charge neutralization and compaction.
The complexes are endocytosed upon association with the cell membrane, and the DNA
somehow escapes the endosome, gaining access to the cytoplasm. Cationic
lipid:DNA complexes appear highly stable under normal conditions. Studies of the
cationic lipid DOTAP suggest the complex dissociates when the inner layer of the
cell membrane is destabilized and anionic lipids from the inner layer displace DNA
from the cationic lipid. Several cationic lipids are available commercially. Two of
these, DMRI and DC-cholesterol, have been used in human clinical trials. First
generation cationic lipids are less efficient than viral vectors. For delivery to lung, any
inflammatory responses accompanying the liposome administration are reduced by
changing the delivery mode to aerosol administration which distributes the dose
more evenly.

Drug screening 

Genes identified as changing in various stages of colorectal cancer can be used as
markers for drug screening. Thus by treating colorectal cancer cells with test
compounds or extracts, and monitoring the expression of genes identified as
changing in the progression of colorectal cancers, one can identify compounds or
extracts which change expression of genes to a pattern which is of an earlier stage
or even of normal colorectal mucosa.

The following are non-limiting examples illustrating the present invention.

Experimentals

We have used two different approaches to identify tumor suppressors, oncogenes
and classifiers. The first approach was a spreadsheet approach in which
we used the fold change and the pattern of expression being present or absent in
the different preparations of RNA. The second approach was a
mathematical approach in which we used correlation to a predefined profile as
selection criterion, based on Pearson's correlation coefficient.
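
The second, correlation-based approach can be sketched as follows. The sketch assumes one expression value per gene for each of the five pools (Normal, Dukes A, B, C, D); the predefined profile, the threshold of 0.9, the gene names and the intensities are hypothetical and only illustrate the principle of selecting genes by Pearson's correlation coefficient to a profile.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_by_profile(expression, profile, threshold=0.9):
    """Return genes whose pool-wise expression correlates with the profile."""
    return sorted(
        gene for gene, levels in expression.items()
        if pearson(levels, profile) >= threshold
    )


# Hypothetical suppressor-like profile: high in Normal, low in Dukes A-D.
PROFILE = [1.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical intensities in the order Normal, Dukes A, B, C, D.
expression = {
    "geneA": [500.0, 60.0, 50.0, 55.0, 40.0],
    "geneB": [100.0, 110.0, 95.0, 105.0, 100.0],
    "geneC": [20.0, 200.0, 220.0, 210.0, 230.0],
}

selected = select_by_profile(expression, PROFILE)
print(selected)
```

An oncogene-like profile could be expressed in the same way by inverting the profile, e.g. low in the Normal pool and high in the tumor pools.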

Examples 

Example 1

Quantification of gene expression using microarrays

Material

Colon tumor and normal oral resection edge biopsies were sampled from each
patient after informed consent was obtained, and after removal of the necessary
amount of tissue for routine pathological examination. The numbers of tissue samples
examined were: normal resection edge, 6; Dukes A, 5; B, 6; C, 6; D, 4. The six normal
tissue samples were all from Dukes A individuals.
RNA from different tumors of the same stage was combined to form each pool.
Five such pools were prepared: Normal pool, Dukes A pool, Dukes B pool, Dukes
C pool, Dukes D pool. All tumors and normal tissue specimens were from the
sigmoid or upper rectum.

Preparation of mRNA

Total RNA was isolated using the RNAzol B RNA isolation method (WAK-Chemie
Medical GmbH). Poly(A)+ RNA was isolated by an oligo-dT selection step
(Oligotex mRNA kit from Qiagen).

Preparation of cRNA

One µg mRNA was used as starting material for the cDNA preparation. The first and
second strand cDNA synthesis was performed using the Superscript Choice System
(Life Technologies) according to the manufacturer's instructions, except that an
oligo-dT primer containing a T7 RNA polymerase promoter site was used. Labeled
cRNA was prepared using the MEGAscript In Vitro Transcription kit (Ambion). Biotin
labeled CTP and UTP (Enzo) were used in the reaction together with unlabeled
NTPs. Following the IVT reaction, the unincorporated nucleotides were removed
using RNeasy columns (Qiagen).

Array hybridization and scanning 

Ten µg of cRNA was fragmented at 94°C for 35 min in a fragmentation buffer
containing 40 mM Tris-acetate pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to
hybridization, the fragmented cRNA in a 6xSSPE-T hybridization buffer (1 M NaCl,
10 mM Tris pH 7.6, 0.005% Triton) was heated to 95°C for 5 min and subsequently
to 40°C for 5 min before loading onto an Affymetrix probe array cartridge. The
probe array was then incubated for 16 h at 40°C at constant rotation (60 rpm). The
washing and staining procedure was performed in the Affymetrix Fluidics Station.
The probe array was exposed to 10 washes in 6xSSPE-T at 25°C followed by 4
washes in 0.5xSSPE-T at 50°C. The biotinylated cRNA was stained with a
streptavidin-phycoerythrin conjugate, 10 µg/ml (Molecular Probes, Eugene, OR) in
6xSSPE-T for 30 min at 25°C followed by 10 washes in 6xSSPE-T at 25°C. The
probe arrays were scanned at 560 nm using a confocal laser scanning microscope
with an argon ion laser as the excitation source (made for Affymetrix by Molecular
Dynamics). Following this scan, the array was incubated with an anti-avidin antibody
and a biotinylated anti-immunoglobulin, and the streptavidin-phycoerythrin step
was repeated.

The readings from the quantitative scanning were analyzed by the Affymetrix Gene 
Expression Analysis Software. 

Normalization of data 

To compare samples, normalization of the data was necessary. For that purpose we
compared scaling to a total GAPDH intensity (sum of 3', middle and 5' probe sets) of
7000 units with scaling to a total array intensity (global scaling) of 281850 units
(averaging 150 units per probe set). Both gave similar results, with scaling factors
that differed by less than ten percent in a set of experiments. Based on this we chose
global scaling for all experiments.
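
The global scaling described above can be illustrated with a minimal sketch. It assumes that scaling means multiplying every probe-set intensity on an array by one factor so that the mean intensity equals the target of 150 units per probe set; the function names and the intensity values are hypothetical and are not taken from the Affymetrix software.

```python
def global_scaling_factor(intensities, target_mean=150.0):
    """Factor that brings the mean probe-set intensity to target_mean."""
    if not intensities:
        raise ValueError("no probe-set intensities given")
    mean = sum(intensities) / len(intensities)
    if mean <= 0:
        raise ValueError("mean intensity must be positive")
    return target_mean / mean


def apply_scaling(intensities, factor):
    """Multiply every intensity by the same scaling factor."""
    return [value * factor for value in intensities]


def relative_difference(factor_a, factor_b):
    """Relative difference between two scaling factors (e.g. GAPDH vs. global)."""
    return abs(factor_a - factor_b) / max(factor_a, factor_b)


# Hypothetical raw intensities for one array (illustration only).
raw = [120.0, 480.0, 60.0, 300.0]
factor = global_scaling_factor(raw)
scaled = apply_scaling(raw, factor)
```

With these made-up values the raw mean is 240 units, so the factor is 0.625 and the scaled mean is 150 units. The helper relative_difference expresses the kind of comparison reported in the text, where the GAPDH-based and global factors differed by less than ten percent.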

Example 2 

Change of transcript level during the progression of colon cancer 

Biopsies from human colon tumors were analyzed as pools of tumors representing
the different stages in the progression of the colon cancer disease. A total of 4 tumor
pools were used, each pool made by combining four to six tumors (see materials
and methods). To generate a normal reference material, we pooled biopsies from
normal colon mucosa from six volunteers.

From the biopsies, RNA was extracted and reverse transcribed to cDNA, and the cDNA
was transcribed into labelled cRNA, which was incubated on the array cartridges,
followed by scanning and scaling to a global array intensity amounting to 150 units
per probe set. The scaling made it possible to compare individual experiments to
each other. To verify the reproducibility, double determinations were made in
selected cases and showed a good correlation.

The software GeneArray Analysis Suite 3.1 from Affymetrix, Inc. was used to analyse
the array data. In this software, increased levels indicate that the transcript is
either up-regulated at the stated level or turned on de novo, reaching a given fold
above the background level. Decreased levels in a similar way indicate reduction or
loss of transcript. Alterations of a single transcript during the progression of the
colon cancer disease can follow several different pathways. Some of the transcript
changes reflect the transition from normal cells to tumor cells, others an increase in
malignancy from Dukes A to Dukes B.

Example 2 

A. Finding classifiers and predictors of colorectal cancer based on a
spreadsheet approach.

We used a spreadsheet to sort genes based on different parameters obtained from
the Affymetrix analysis software.

The mRNA expression analysis on the Affymetrix arrays resulted in 42,843
datasets identifying individual genes (Table I) or ESTs (Table II) altogether. These
were obtained from the 6.8k arrays (7,129 datasets) and the EST arrays
(35,714 datasets).

Description of the sorting procedure for the spreadsheet sorting.
Per dataset the following was listed:

Probe Set No.; present or absent in normal tissue or the different Dukes types;
gene name, homology or number; "Avg Diff", which is the level of expression; "Abs
Call", which determines if the gene is present (P) or absent (A); "Diff Call", which
determines the alteration as increasing (I) or decreasing (D); "fold change", the fold
change from the normal tissue expression level; and the "sort score", which determines
the likelihood that the change is real (if above 0.5).

The following steps were performed:

1. exclude data if "Probe Set" is an AFFX marker (58 per array or sub-array)

2. exclude data if "Diff Call" in all 4 comparisons is "NC" (no change)

3. exclude data if "Abs Call" in all 4 comparisons is "A" (absent)

4. exclude data if three "Diff Call" values are "NC" and one is "MI" or "MD"

5. select data with the absolute value of the sort score, |sort score|, arbitrarily
set to >= 0.5
(At this step the sorting resulted in the following number of genes sorted as
being of importance: 908 genes (12.7%) and 4155 ESTs (11.6%).)

6. sort according to the pattern of Abs Calls (e.g. PAAAA = lost from N to tumour
Dukes A, B, C, D)

7. select data with Avg Diff of >= 300 (500 for some ESTs) and/or fold change >=
3 (>= 5 for some ESTs)

Number of genes sorted out as being of interest after this final sorting: about 130
genes (1.8%) and about 240 ESTs (0.7%).
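
The sorting steps above can be sketched as a filter. This is an illustration only: the original sorting was done in a spreadsheet, and the record layout (a dict holding five Abs Calls for N, A, B, C, D and four Diff Calls, sort scores and fold changes for the tumour-versus-normal comparisons) is an assumption. Where the text leaves details open, the sketch assumes that the sort score and the Avg Diff / fold change thresholds are tested against the largest value across comparisons, that "and/or" is read as "or", and that step 3 is checked over all five samples.

```python
def keep_dataset(d, min_sort_score=0.5, min_avg_diff=300, min_fold_change=3):
    """Return True if a dataset passes exclusion steps 1-5 and selection step 7."""
    diff_calls = d["diff_calls"]
    # Step 1: drop Affymetrix control (AFFX) probe sets.
    if d["probe_set"].startswith("AFFX"):
        return False
    # Step 2: drop datasets with no change in every comparison.
    if all(call == "NC" for call in diff_calls):
        return False
    # Step 3: drop datasets that are absent everywhere.
    if all(call == "A" for call in d["abs_calls"]):
        return False
    # Step 4: drop datasets with three "NC" and one marginal call (MI or MD).
    n_marginal = sum(call in ("MI", "MD") for call in diff_calls)
    if diff_calls.count("NC") == 3 and n_marginal == 1:
        return False
    # Step 5: require |sort score| >= 0.5 in at least one comparison (assumption).
    if max(abs(score) for score in d["sort_scores"]) < min_sort_score:
        return False
    # Step 7: require a strong signal or a strong change (assumed to mean "or").
    strong_signal = max(d["avg_diffs"]) >= min_avg_diff
    strong_change = max(abs(fc) for fc in d["fold_changes"]) >= min_fold_change
    return strong_signal or strong_change


def abs_call_pattern(d):
    """Step 6: Abs Call pattern in the order N, A, B, C, D, e.g. 'PAAAA'."""
    return "".join(d["abs_calls"])


# Hypothetical datasets (identifiers and numbers are made up).
lost_gene = {
    "probe_set": "GENE1_at",
    "abs_calls": ["P", "A", "A", "A", "A"],
    "diff_calls": ["D", "D", "D", "D"],
    "sort_scores": [-2.0, -1.8, -2.1, -1.9],
    "avg_diffs": [831, 20, 15, 10, 12],
    "fold_changes": [-8.0, -9.0, -10.0, -9.5],
}
control = dict(lost_gene, probe_set="AFFX-CONTROL_at")
marginal = dict(lost_gene, diff_calls=["NC", "MI", "NC", "NC"])
```

Under these assumptions the first record is kept and sorted under the pattern PAAAA, while the control probe set and the marginal record are excluded. The higher thresholds mentioned for some ESTs (500 and 5) can be passed through the keyword arguments.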

The following tables show the genes (Table I) and ESTs (Table II) that were
identified by this approach, analyzing the hu 6.8K gene array. First a list of the
potential tumor suppressors, then a list of the potential oncogenes, finally a list of
genes that can be used to classify the different Dukes stages. Genes that are in bold
are those that we find are of the utmost interest.

The table (Table III) that follows this section is based on the hu EST arrays Hu35k
Sub A, B, C, D. These are also divided into ESTs that are supposed to be expressed
from tumor suppressors and oncogenes, as well as from genes that can be used
as classifiers of the different Dukes stages. The most interesting ESTs are shown in
bold.

Table I

Fold change in comparison to normal

SUPPRESSOR CLASSIFIER

Genes lost (PAAA or PPAAA)

Gene name | Acc No | Avg Diff (N) | A | B
Human chromogranin A mRNA, complete cds | J03915 | 831 | lost | lost
Human adipsin/complement factor D mRNA, complete cds | M84526 | 822 | lost | lost
Homo sapiens MLC-1V/Sb isoform gene | M24248 | 799 | lost | lost
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds | M22324 | 657 | lost | lost
H.sapiens MT-11 mRNA | X76717 | 650 | lost | lost
H.sapiens GCAP-II gene | Z70295 | 572 | lost | lost
Human somatostatin I gene and flanks | J00306 | 516 | lost | lost
Human YMP mRNA, complete cds | U52101 | 459 | lost | lost
H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel | X87159 | 439 | lost | lost
Human K12 protein precursor mRNA, complete cds | U77643 | 429 | 121 | lost
Human sulfate transporter (DTD) mRNA, complete cds | U14528 | 397 | lost | lost
Human transcription factor hGATA-6 mRNA, complete cds | U66075 | 337 | lost | lost
H.sapiens SCAD gene, exon 1 and joining features | Z80345 | 326 | lost | lost
Human S-lac lectin L-14-II (LGALS2) gene | M87860 | 301 | lost | lost
Human mRNA for protein tyrosine phosphatase | D15049 | 277 | 43 | lost
H.sapiens mRNA for tetranectin | X64559 | 235 | lost | lost
Human 11kd protein mRNA, complete cds | U28249 | 233 | 47 | lost
Human anti-mullerian hormone type II receptor precursor gene, complete cds | U29700 | 223 | lost | lost
Human heparin binding protein (HBp17) mRNA, complete cds | M60047 | 218 | lost | lost
Human ADP-ribosylation factor (hARF6) mRNA, complete cds | M57763 | 209 | lost | lost
beta-ADD=adducin beta subunit 63 kDa isoform/membrane skeleton protein, beta-ADD=adducin beta subunit 63 kDa isoform/membrane skeleton protein (alternatively spliced, exon 10 to 13 region) [human, Genomic, 1851 nt, segment 3 of 3] | S81083 | 188 | lost | lost
Zinc Finger Protein Znf165 | HG4243-HT4613 | 186 | lost | lost
Human glucagon mRNA, complete cds | J04040 | 182 | 25 | lost
H.sapiens mRNA for hair keratin, hHb5 | X99140 | 158 | lost | lost
Human tubulin-folding cofactor E mRNA, complete cds | U61232 | 150 | lost | lost
Human integrin alpha-3 chain mRNA, complete cds | M59911 | 126 | lost | lost
Human NACP gene | U46901 | 123 | lost | lost
H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5) | Z47553 | 110 | lost | lost
Human mRNA for ATF-a transcription factor | X52943 | 104 | lost | lost
H.sapiens intestinal VIP receptor related protein mRNA | X77777 | 93 | lost | lost


Gene name | Acc No | Avg Diff | fold change to N
Homo sapiens SKB1Hs mRNA, complete cds. /gb=AF015913 /ntype=RNA | AF015913 | 188 | Lost
Mucin (Gb:M22406) | HG1067-HT1067 | 501 | Lost
Human platelet activating factor acetylhydrolase, brain isoform, 45 kDa subunit (LIS1) gene | U72342 | 114 | Lost
Homo sapiens ERK activator kinase (MEK2) mRNA | L11285 | 1470 | -5.2
Human 20-kDa myosin light chain (MLC-2) mRNA, complete cds | J02854 | 2047 | -4.5
H.sapiens lysosomal acid phosphatase gene (EC 3.1.3.2) Exon 1 (and joined CDS) | X15525 | 285 | -4.4
Human mRNA for matrix Gla protein | X53331 | 1069 | -4.2
H.sapiens mRNA for diacylglycerol kinase | X62535 | 362 | -3.5
Human heat shock protein (hsp 70) gene, complete cds | M11717 | 405 | -3.2
Human TRPM-2 protein gene | M63379 | 1594 | -3

Human gene for mitochondrial acetoacetyl-CoA thiolase | D10511 | 198 | lost
Human mRNA for transcription factor AREB6, complete cds | D15050 | 232 | lost

Human mRNA for KIAA0248 gene, partial cds | D87435 | 374 | lost
Homo sapiens (clone CC6) NADH-ubiquinone oxidoreductase subunit mRNA, 3' end cds | L04490 | 683 | lost
Human phosphoglucomutase 1 (PGM1) mRNA, complete cds | M83088 | 1096 | lost
Homo sapiens guanylin mRNA, complete cds | M97496 | 4983 | lost
Human trans-Golgi p230 mRNA, complete cds | U41740 | 131 | lost
H.sapiens mRNA for vacuolar proton ATPase, subunit D | X71490 | 414 | lost
H.sapiens mRNA for 3-hydroxy-3-methylglutaryl coenzyme A synthase | X83618 | 2198 | lost
Human mRNA for KIAA0018 gene, complete cds | D13643 | 377 | -7.7
Mucin 1, Epithelial, Alt. Splice 9 | HG371-HT26388 | 3296 | -4.1
H.sapiens mRNA for L-3-hydroxyacyl-CoA dehydrogenase | X96752 | 252 | -3

[illegible] classifier

Homo sapiens colon mucosa-associated (DRA) mRNA, complete cds | L02785 | 2978 | lost
Human Ig J chain gene | M12759 | 2193 | Lost
Human selenium-binding protein (hSBP) mRNA, complete cds. /gb=U29091 /ntype=RNA | U29091 | 1849 | Lost
H.sapiens mRNA for sigma 3B protein | X99459 | 722 | Lost
Human ERK1 mRNA for protein serine/threonine kinase | X60188 | 576 | Lost
Human mRNA for mitochondrial 3-oxoacyl-CoA thiolase, complete cds | D16294 | 529 | Lost
Biliary Glycoprotein, Alt. Splice 5, A | HG2850-HT4814 | 489 | Lost
Human AQP3 gene for aquaporine 3 (water channel), partial cds | AB001325 | 413 | Lost
Human CD14 mRNA for myeloid cell-specific leucine-rich glycoprotein | X13334 | 413 | Lost
Human thioredoxin mRNA, nuclear gene encoding mitochondrial protein, complete cds | U78678 | 411 | Lost
Human mitochondrial ATPase coupling factor 6 subunit (ATP5A) mRNA, complete cds | M37104 | 373 | Lost
Human MHC class II HLA-DP light chain mRNA, complete cds | M57466 | 327 | Lost
Human mRNA for early growth response protein 1 (hEGR1) | X52541 | 281 | Lost
Human mRNA for mitochondrial 3-ketoacyl-CoA thiolase beta-subunit of trifunctional protein, complete cds | D16481 | 268 | Lost
Homo sapiens laminin-related protein (LamA3) mRNA, complete cds | L34155 | 252 | Lost
H.sapiens mRNA for selenoprotein P | [illegible] | 232 | Lost
Human hkf-1 mRNA, complete cds | [illegible] | 211 | Lost
Homo sapiens nuclear domain 10 protein (ndp52) mRNA, complete cds | [illegible] | 150 | Lost
Human X104 mRNA, complete cds | L27476 | 149 | Lost
H.sapiens cDNA for RFG | X77548 | 130 | Lost
H.sapiens mRNA for Progression Associated Protein | Y07909 | 128 | Lost
Human liver 2,4-dienoyl-CoA reductase mRNA, complete cds | U49352 | [illegible] | Lost
Human A33 antigen precursor mRNA, complete cds | U79726 | 1650 | -6.9
H.sapiens pS2 protein gene | X52003 | 4298 | -6
Human RASF-A PLA2 mRNA, complete cds | M22430 | 4983 | -5.8
Homo sapiens psti mRNA for pancreatic secretory inhibitor (expressed in neoplastic tissue) | Y00705 | 344 | -3.1
Human (CO-029) | M35252 | 3500 | -3

Human complement component C3 mRNA, alpha and beta subunits, complete cds | K02765 | 744 | lost
H.sapiens mRNA for adenosine triphosphatase, calcium | Z69881 | 439 | lost
Human skeletal muscle LIM-protein SLIM1 mRNA, complete cds | U60115 | 281 | lost
Human platelet-derived growth factor receptor alpha (PDGFRA) mRNA, complete cds | M21574 | 187 | lost
Human mRNA for KIAA0247 gene, complete cds | D87434 | 172 | lost
Human mRNA for KIAA0171 gene, complete cds | D79993 | 151 | lost
Human Down syndrome critical region protein (DSCR1) mRNA, complete cds | U28833 | 150 | lost
Human Ki nuclear autoantigen mRNA, complete cds | U11292 | 125 | lost
Homo sapiens chromosome 16 BAC clone CIT987SK-615A9 complete sequence | AF001548 | 3513 | -3.6 | -4.3
Human mRNA for ATP synthase alpha subunit, complete cds | D14710 | 3580 | -3.8 | -5.6

Human mRNA for IgG Fc binding protein, complete cds | D84239 | 3755 | -19.3 | -7.1
H.sapiens mRNA for carcinoembryonic antigen, CGM2 | X98311 | 2456 | -12 | -6.5
Homo sapiens (clone lamda-hPEC-3) phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds | L05144 | 2630 | -7.6 | -14.7
Human 11-beta-hydroxysteroid dehydrogenase type 2 mRNA, complete cds | U26726 | 1865 | -7.1 | -4.7
Human intestinal mucin (MUC2) mRNA, complete cds | L21998 | 7803 | -5.5 | -4.2
Human mRNA for KIAA0106 gene, complete cds | D14662 | 766 | -4.7 | -3.2
metallothionein | V00594 | 5417 | -4 | -6.3



Table I (cont.)

Fold change in comparison to normal

Oncogene CLASSIFIER

Gene name | Acc No | Avg Diff | Avg Diff |
Homo sapiens (clones MOP4, MOP7) microsomal dipeptidase (MDP) mRNA, complete cds | J05257 | 1606 | 1403 | gained
Homo sapiens reg gene homologue, complete cds | L08010 | 1165 | 294 | gained
H.sapiens mRNA for prepro-alpha2(I) collagen | Z74616 | 1003 | 905 | gained
Human S-adenosylhomocysteine hydrolase (AHCY) mRNA, complete cds | M61832 | 882 | 817 | gained
Transcription Factor IIIa | HG4312-HT4582 | 837 | 948 | gained
Human gene for melanoma growth stimulatory activity (MGSA) | X54489 | 731 | 330 | gained
Human stromelysin-3 mRNA | X57766 | 643 | 1116 | gained
CDC25Hu2=cdc25+ homolog [human, mRNA, 3118 nt] | S78187 | 603 | 627 | gained
Human mRNA for cripto protein | X14253 | 532 | 293 | gained
Human transformation-sensitive protein (IEF SSP 3521) mRNA, complete cds | M86752 | 529 | 866 | gained
Human complement component 2 (C2) gene allele b | L09708 | 515 | 625 | gained
H.sapiens mRNA for ITBA2 protein | X92896 | 444 | 459 | gained
H.sapiens encoding CLA-1 mRNA | Z22555 | 422 | 549 | gained
Human fibroblast growth factor receptor 4 (FGFR4) mRNA, complete cds | L03840 | 359 | 276 | gained
Fibronectin, Alt. Splice 1 | HG3044-HT3742 | 354 | 261 | gained
tyk2 | X54667 | 336 | 352 | gained
Human mRNA for B-myb gene | X13293 | 333 | 322 | gained
Human phosphofructokinase (PFKM) mRNA, complete cds | U24183 | 296 | 426 | gained
Human pre-B cell enhancing factor (PBEF) mRNA, complete cds | U02020 | 276 | 242 | gained
Human SH2-containing inositol 5-phosphatase (hSHIP) mRNA, complete cds | U57650 | 254 | 315 | gained

Gene name | Acc No
Human interleukin 8 (IL8) gene, complete cds | M28130
Human lamin B receptor (LBR) mRNA, complete cds | L25931
H.sapiens mRNA for protein tyrosine phosphatase | Z48541
Human mRNA for unc-18 homologue, complete cds | D63851
H.sapiens mRNA for Zn-alpha2-glycoprotein | X59766
[illegible] | Z25521
Human asparagine synthetase mRNA, complete cds | M27396
Human hepatitis delta antigen interacting protein A (dipA) mRNA, complete cds | U63825
Human spliceosomal protein (SAP 61) mRNA, complete cds | U08815
Human protein kinase C-binding protein RACK7 mRNA, partial cds | U48251
Human MAC30 mRNA, 3' end | L19183
Human thrombospondin 2 (THBS2) mRNA, complete cds | L12350
Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds | U08021
H.sapiens mRNA for type I interstitial collagenase | X54925
Human cytochrome b561 gene | U29463
Human H19 RNA gene, complete cds (spliced in silico) | M32053
Human collagen type XVIII alpha 1 (COL18A1) mRNA, partial cds | L22548
Human clone 23733 mRNA, complete cds | U79274



251 


609 


gained 


239 


193 


gained 




151 


gained 


217 


198 


gained 


215 


156 


gained 


215 


127 


won 


212 


195 


gained 


21 1 


231 




157 


201 


gained 








1 £0 






111 


126 


gained 


107 


261 


gained 


105 


123 


gained 


85 


85 


gained 


72 


4498 


gained 


67 


275 


gained 


absent 


162 


gained 



Gene name | Acc No | Avg Diff | fold change to N
Human migration inhibitory factor-related protein 8 (MRP8) gene, complete cds | M21005 | 120 | GAINED
Human acyloxyacyl hydrolase mRNA, complete cds | M62840 | 130 | GAINED
Human PEP19 (PCP4) mRNA, complete cds | U52969 | 174 | GAINED
H.sapiens Humig mRNA | X72755 | 118 | GAINED
H.sapiens PISSLRE mRNA | X78342 | 125 | GAINED
H.sapiens mRNA for twist protein, partial. /gb=Y11180 /ntype=RNA | Y11180 | 121 | GAINED
Human mRNA for TGF-beta superfamily protein, complete cds | AB000584 | 1372 | 3.5
Human mRNA for MSS1, complete cds | D11094 | 292 | 3.1
Human complement factor B mRNA, complete cds | L15702 | 2082 | 3.3
Homo sapiens GTP-binding protein (RAB2) mRNA, complete cds | M28213 | 289 | 3.1
Human translational initiation factor 2 beta subunit (eIF-2-beta) mRNA, complete cds | M29536 | 956 | 4.1
Human E16 mRNA, complete cds | M80244 | 278 | 3.8
IEX-1=radiation-inducible immediate-early gene [human, placenta, mRNA Partial, 1223 nt] | S81914 | 1531 | 3.6
Human CDC16Hs mRNA, complete cds | U18291 | 244 | 6.1
Human DD96 mRNA, complete cds | U21049 | 625 | 3.2
Human (memc) mRNA, 3'UTR. /gb=U30999 /ntype=RNA | U30999 | 256 | 3.8
Human ubiquitin-conjugating enzyme (UBE2I) mRNA, complete cds | U45328 | 448 | 10.6
Human fetal brain glycogen phosphorylase B mRNA, complete cds | U47025 | 2349 | 3.7
Human BTG2 (BTG2) mRNA, complete cds | U72649 | 527 | 5.2
Human jun-B mRNA for JUN-B protein | X51345 | 1350 | 4.6


[illegible] classifier

Human adipocyte lipid-binding protein, complete cds | J02874 | 268 | GAINED
Human A1 protein mRNA, complete cds | U29680 | 102 | GAINED
Human LGN protein mRNA, complete cds | U54999 | 110 | GAINED
Human skeletal muscle LIM-protein SLIM2 mRNA, partial cds | U60116 | 109 | GAINED
Human mRNA for alpha1-acid glycoprotein (orosomucoid) | X02544 | 156 | GAINED
Human mRNA for fibronectin receptor alpha subunit | X06256 | 46 | GAINED
H.sapiens P1-Cdc21 mRNA | X74794 | 278 | GAINED
H.sapiens mRNA for fibulin-2 | X82494 | 284 | GAINED
H.sapiens 5T4 gene for 5T4 Oncofetal antigen | Z29083 | 152 | GAINED
Homo sapiens mRNA for osteoblast specific factor 2 (OSF-2os) | D13666 | 324 |
Mac25 | HG987-HT987 | 2772 |
Human lysozyme mRNA, complete cds with an Alu repeat in the 3' flank | J03801 | 920 | 3.7
Human metalloproteinase (HME) mRNA, complete cds | L23808 | 794 | 7.4
Human alpha-1 collagen type IV gene, exon 52 | M26576 | 610 |
Human lumican mRNA, complete cds | U21128 | 1105 | [illegible]
Human mRNA for fibronectin (FN precursor) | [illegible] | [illegible] | 5.5
Human mRNA fragment for elongation factor TU (N-terminus). /gb=X03689 /ntype=RNA | [illegible] | [illegible] | 3.1
Human mRNA for type IV collagen alpha-2 chain | X05610 | 1531 | 3
Human mRNA for collagen VI alpha-1 C-terminal globular domain | X15880 | 2062 | 3.5
H.sapiens gene for Membrane cofactor protein | X59405 | 272 | 3.4
H.sapiens SOD-2 gene for manganese superoxide dismutase. /gb=X65965 /ntype=DNA /annot=exon | X65965 | 234 | 3.1
H.sapiens NMB mRNA | X76534 | 338 | 3.3
H.sapiens vimentin gene | Z19554 | 3472 | 3.2








Ribosomal Protein L39 Homolog | HG2874-[illegible] | 102 | GAINED
Homo sapiens (clone d2-115) kappa opioid receptor (OPRK1) mRNA, complete cds | [illegible] | 100 | GAINED
Human Kell blood group protein mRNA | M64934 | 143 | GAINED
[illegible] | U73167 | 374 | GAINED
[illegible] protease with IGF-binding motif, complete cds | [illegible] | [illegible] | 3.4
Human interferon-inducible protein 27-Sep mRNA | J04164 | 7717 | 3.8
Human sickle cell beta-globin mRNA, complete cds | M25079 | 3090 | 4.6
[illegible] | M29277 | 1588 | 3.7
Human spermidine synthase mRNA, complete cds | M34338 | 866 | 4.1
Human copine I mRNA, complete cds | U83246 | 2079 | 3.7












Only [illegible] classifier

Homo sapiens FRG1 mRNA, complete cds | L76159 | 73 | GAINED
Human cyclin protein gene, complete cds | M15796 | 149 | GAINED
Human U2 small nuclear RNA-associated B" antigen mRNA, complete cds | M15841 | 194 | GAINED
Human mRNA export protein Rae1 (RAE1) mRNA, complete cds | U84720 | 193 | GAINED
Human protease-activated receptor 3 (PAR3) mRNA, complete cds | U92971 | 142 | GAINED
H.sapiens mRNA for mediator of receptor-induced toxicity | X84709 | 200 | GAINED
H.sapiens RFXAP mRNA | Y12812 | 230 | GAINED
Human mRNA for Qip1, complete cds | AB002533 | 8881 |
Human mRNA for transferrin receptor | X01060 | 557 | 3
metastasis-associated gene [human, highly metastatic lung cell subline Anip[973], mRNA Partial, 978 nt] | S79219 | 216 | 4




















Human chaperonin 10 mRNA, complete cds | U07550 | 50 | 4.1 | 3.3
H.sapiens RING4 cDNA | X67522 | 73 | 4.9 | 5.4
H.sapiens genes TAP1, TAP2, LMP2, LMP7 and DOB | X66401 | 134 | 3.2 |
H.sapiens mRNA for alpha 4 protein | Y08915 | 96 | 3.7 | 3.6
Homo sapiens interleukin-1 receptor-associated kinase (IRAK) mRNA, complete cds | L76191 | 285 | 3.1 | 3.1
Human von Willebrand factor mRNA, 3' end | M10321 | 84 | 3.7 | 4.1
Human chromosome segregation gene homolog CAS mRNA, complete cds | U33286 | 86 | 4.8 | 3.6
Human Bruton's tyrosine kinase-associated protein-135 mRNA, complete cds | U77948 | 68 | 3.4 | 4.9
Human KH type splicing regulatory protein KSRP mRNA, complete cds | U94832 | 52 | 3.2 | 3.2
H.sapiens ADE2H1 mRNA showing homologies to SAICAR synthetase and AIR carboxylase of the purine pathway (EC 6.3.2.6, EC 4.1.1.21) | X53793 | 40 | 3 | 3.1

[illegible] classifier

Globin, Beta | HG1428-HT1428 | 504 | 3.1 | 4.3
Human alpha-1 collagen type I gene, 3' end | M55998 | 2706 | 3.1 | 3.7
H.sapiens mRNA for SOX-4 protein | X70683 | 130 | 4.5 | 4.5
Human mRNA for collagen binding protein 2, complete cds | D83174 | 131 | 8.1 | 6.1
Human SPARC/osteonectin mRNA, complete cds | J03040 | 358 | 6.1 | 3.9
Human PRAD1 mRNA for cyclin | X59798 | 263 | 3.3 | 3.4

Human transforming growth factor-beta induced gene product (BIGH3) mRNA, complete cds | M77349 | 426

B. Finding potential classifier genes for colorectal cancer (Dukes A, B, C and D)
by sorting according to Pearson correlation coefficient

Primary selection criteria for classifier genes:

1. All genes with a score of A (AbsCall) or NC (DiffCall) for all groups (N, A, B, C & D) were removed.

2. Genes with a fold change below 5 and a Sort Score below 0.5 were removed.

3. If DiffCall was NC for a gene in a particular experiment, the FC (fold change) was set to 1.

Secondary selection criteria for classifier genes:

Based on the Pearson correlation coefficient (Figure 1), genes similar to a predefined profile were selected.

r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}

Figure 1: Pearson correlation coefficient (r)
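
As an illustration of the secondary selection, the following minimal sketch computes the Pearson correlation coefficient in its sum form and compares a gene's fold changes with the predefined stage profiles (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0) and (0, 0, 0, 1) used for the A, B, C and D classifiers. The function names and the example fold changes are hypothetical. Picking the best-matching profile is an assumption, since the text does not state a cut-off for r, and fold changes are used as listed without any sign conversion.

```python
from math import sqrt

# Predefined profiles over the four comparisons Dukes A, B, C, D versus normal.
PROFILES = {"A": (1, 0, 0, 0), "B": (0, 1, 0, 0), "C": (0, 0, 1, 0), "D": (0, 0, 0, 1)}


def pearson(x, y):
    """Pearson correlation coefficient; None if either series is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    denominator = (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    if denominator <= 0:
        return None
    return (n * sum_xy - sum_x * sum_y) / sqrt(denominator)


def neutralise_no_change(fold_changes, diff_calls):
    """Primary criterion 3: set the fold change to 1 where the Diff Call is NC."""
    return [1.0 if call == "NC" else fc for fc, call in zip(fold_changes, diff_calls)]


def best_profile(fold_changes, diff_calls):
    """Return (stage, r) for the profile that correlates best with the gene."""
    values = neutralise_no_change(fold_changes, diff_calls)
    scores = {stage: pearson(profile, values) for stage, profile in PROFILES.items()}
    scores = {stage: r for stage, r in scores.items() if r is not None}
    if not scores:
        return None
    stage = max(scores, key=scores.get)
    return stage, scores[stage]


# Hypothetical gene that is changed only in Dukes A.
stage, r = best_profile([6.1, 1.3, 1.2, 0.9], ["I", "NC", "NC", "NC"])
```

For this made-up gene the fold changes become (6.1, 1, 1, 1) after the NC values are set to 1, which correlates perfectly with the A profile and at r = -1/3 with each of the other three profiles.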
Classifier genes for Dukes A, B, C and D:

Table III

A classifiers (Profile 1, 0, 0, 0), Pearson correlation approach

D87444_at | Human mRNA for KIAA0255 gene, complete cds
U18291_at | Human CDC16Hs mRNA, complete cds
L76568_xpt3_f_at | S26 from Homo sapiens excision and cross link repair protein (ERCC4) gene, complete genomic sequence. /gb=L76568 /ntype=DNA /annot=exon
U45328_s_at | Human ubiquitin-conjugating enzyme (UBE2I) mRNA, complete cds
Z14982_rna1_at | H.sapiens gene for major histocompatibility complex encoded proteasome subunit LMP7
AD000092_cds7_s_at | RAD23A gene (human RAD23A homolog) extracted from Homo sapiens DNA from chromosome 19p13.2 cosmids R31240, R30272 and R28549 containing the EKLF, GCDH, CRTC and RAD23A genes, genomic sequence
D86973_at | Human mRNA for KIAA0219 gene, partial cds
X81636_at | H.sapiens clathrin light chain a gene
M59916_at | Human acid sphingomyelinase (ASM) mRNA, complete cds
X85781_s_at | H.sapiens NOS2 gene, exon 27. /gb=X85781 /ntype=DNA /annot=exon
M57731_s_at | Human gro-beta mRNA, complete cds
U49188_at | Human placenta (Diff33) mRNA, complete cds
X53800_s_at | Human mRNA for macrophage inflammatory protein-2beta (MIP2beta)
U56816_at | Human kinase Myt1 (Myt1) mRNA, complete cds
HG1067-HT1067_r_at | Mucin (Gb:M22406)

EST:

RC_F03077_f | Chromosome 17, clone hRPC.15
RC_AA599199 | Alu seq
RC_AA207015 | clone RP4-733M16 on chromosome 1p36.11-36.23
RC_AA234916 | Chromosome 19 clone CTC-461H2
RC_N92239_a | Wnt inhibitory factor-1 (WIF-1), chromosome 12
RC_N93958_s | Phospholipase A2, group X (PLA2G10)
U95301_at | Phospholipase A2, group X (PLA2G10)
RC_AA426330 | Chromosome 17, clone hRPC.1110_E_20
RC_AA024658 | clone SCb-254N2 (UWGC:rg254N02) from 6p21
RC_H88540_a | heat shock protein 90, 1q21.2-q22



B classifiers (Profile 0, 1, 0, 0)

Hu6800:

U57316_at | Human GCN5 (hGCN5) gene, complete cds
X66839_at | H.sapiens MaTu MN mRNA for p54/58N protein
J04599_at | Human hPGI mRNA encoding bone small proteoglycan I (biglycan), complete cds
X57579_s_at | H.sapiens activin beta-A subunit (exon 2)
J02874_at | Human adipocyte lipid-binding protein, complete cds
M11749_at | Human Thy-1 glycoprotein gene, complete cds
U06863_at | Human follistatin-related protein precursor mRNA, complete cds
U51010_s_at | Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region. /gb=U51010 /ntype=DNA /annot=exon
U08021_at | Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds
HG3044-HT3742_s_at | Fibronectin, Alt. Splice 1
X02761_s_at | Human mRNA for fibronectin (FN precursor)
X02544_at | Human mRNA for alpha1-acid glycoprotein (orosomucoid)
M62505_at | Human C5a anaphylatoxin receptor mRNA, complete cds
J05070_at | Human type IV collagenase mRNA, complete cds
U16306_at | Human chondroitin sulfate proteoglycan versican V0 splice-variant precursor peptide mRNA, complete cds
M14218_at | Human argininosuccinate lyase mRNA, complete cds
L77567_s_at | Homo sapiens mitochondrial citrate transport protein (CTP) mRNA, 3' end
M63391_rna1_at | Human desmin gene, complete cds
D13643_at | Human mRNA for KIAA0018 gene, complete cds
D79985_at | Human mRNA for KIAA0163 gene, complete cds

EST:

M63262_at
R67290_at
N36619_at
L19161_at
RC_AA496035
L29217_s_at
RC_W73194_a
RC_N69507_a
RC_H15814_s
M84526_at

5-lipoxygenase activating protein (FLAP), 13q12
Interleukin 14
Translation initiation factor 2, subunit 3, Xp22.2-22.1
Chromosome 1? (TIGR)
CDC-like kinase 3 (CLK3), 15q24
Dermatopontin, 1q12-q23
Hypothetical protein PRO1847 (Alu according to TIGR)
adipose most abundant gene transcript 1
D component of complement (adipsin)



C classifiers (Profile 0, 0, 1, 0)

Hu6800:

M20681_at | Human glucose transporter-like protein-III (GLUT3), complete cds
D50914_at | Human mRNA for KIAA0124 gene, partial cds
L37362_at | Homo sapiens (clone d2-115) kappa opioid receptor (OPRK1) mRNA, complete cds
X66114_rna1_at | H.sapiens gene for 2-oxoglutarate carrier protein
M32053_at | Human H19 RNA gene, complete cds (spliced in silico)
Y00787_s_at | Human mRNA for MDNCF (monocyte-derived neutrophil chemotactic factor)
U64444_at | Human ubiquitin fusion-degradation protein (UFD1L) mRNA, complete cds
X95325_s_at | H.sapiens mRNA for DNA binding protein A variant
X02419_rna1_s_at | H.sapiens uPA gene
X57522_at | H.sapiens RING4 cDNA
AB001325_at | Human AQP3 gene for aquaporine 3 (water channel), partial cds
AB002315_at | Human mRNA for KIAA0317 gene, complete cds. /gb=AB002315 /ntype=RNA
L12760_s_at | Human phosphoenolpyruvate carboxykinase (PCK1) gene, complete cds with repeats
M80899_at | Human novel protein AHNAK mRNA, partial sequence

EST:

RC_AA122350
AA374109_at
RC_AA621755
RC_AA442069
RC_T40767_a
RC_AA488655
RC_AA398908
RC_AA447764
RC_N69136_a

Chromosome 8
spondin 2, extracellular matrix protein, chromosome 4
Transcription factor Dp-2, 3q23
sodium channel 2, 12q12
Chromosome 19
Mus?
Hypothetical protein, chromosome 4



D classifiers (Profile 0, 0, 0, 1)

X17644_s_at | Human GST1-Hs mRNA for GTP-binding protein
Y12812_at | H.sapiens RFXAP mRNA
X60486_at | H.sapiens H4/g gene for H4 histone
X52221_at | H.sapiens ERCC2 gene, exons 1 & 2 (partial)
L06175_at | Homo sapiens P5-1 mRNA, complete cds
Z48481_at | H.sapiens mRNA for membrane-type matrix metalloproteinase 1
X54232_at | Human mRNA for heparan sulfate proteoglycan (glypican)
L08010_at | Homo sapiens reg gene homologue, complete cds
L27706_at | Human chaperonin protein (Tcp20) gene, complete cds
L15533_rna1_at | Homo sapiens pancreatitis-associated protein (PAP) gene, complete cds
X51408_at | Human mRNA for n-chimaerin
K02765_at | Human complement component C3 mRNA, alpha and beta subunits, complete cds
U38904_at | Human zinc finger protein C2H2-25 mRNA, complete cds

EST:

RC_AA121433 | Axin, chromosome 16
RC_N91920_a | RB protein binding protein, chromosome 16
RC_AA621601 | GTP-binding protein Rab36, chromosome 17
RC_AA454020 | NADPH quinone oxidoreductase homolog; p53 induced, chromosome 2
RC_Z39652_a | APM-1 gene, chromosome 18



Conclusion

As can be seen from these tables, we have identified a number of genes and ESTs, based on two
different approaches, that we believe are either of importance for initiating and developing
colorectal cancer, or can be used to classify the disease. These genes and ESTs are subdivided
into potential tumor suppressors, which have a reduced level during progression of the disease
or even completely lose their expression; potential oncogenes, which increase their level during
disease progression or are even gained de novo, not being expressed at early stages or in normal
mucosa; and finally classifiers of the disease, which can be used to identify the different Dukes
stages, e.g. by being expressed only at a certain stage.
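The three categories named in the conclusion can be illustrated with a small sketch. It assumes, hypothetically, one expression value per gene for normal mucosa and for each Dukes stage; the function name `categorize`, the two-fold default and the clamping floor are illustrative choices and are not taken from the tables.

```python
# Order of the five hypothetical measurements for one gene.
STAGES = ["normal", "A", "B", "C", "D"]


def categorize(levels, min_fold=2.0, floor=1.0):
    """Assign a gene to one of the categories named in the conclusion.

    levels: five expression values ordered as STAGES (arbitrary units).
    min_fold and floor are illustrative assumptions.
    """
    # Clamp values so that ratios stay finite for genes without detectable signal.
    vals = [max(float(v), floor) for v in levels]

    # Classifier: one Dukes stage stands out above every other sample.
    for i in range(1, len(vals)):
        others = vals[:i] + vals[i + 1:]
        if all(vals[i] >= min_fold * o for o in others):
            return "classifier of Dukes " + STAGES[i]

    # Reduced level at the latest stage compared with normal mucosa.
    if vals[0] >= min_fold * vals[-1]:
        return "potential tumor suppressor"

    # Increased level at the latest stage compared with normal mucosa.
    if vals[-1] >= min_fold * vals[0]:
        return "potential oncogene"

    return "unclassified"
```

For example, a gene measured at `[100, 80, 40, 20, 5]` falls in the tumor suppressor category under these assumptions, while `[0, 0, 90, 0, 0]` is reported as a classifier of Dukes B.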
Claims:

1. A method of determining the presence or absence of a biological condition in
animal tissue,

comprising collecting a sample comprising cells from the tissue and/or expression
products from the cells,

assaying a first expression level of at least one gene from a first gene group,
wherein the gene from the first gene group is selected from genes expressed in
normal tissue cells in an amount higher than expression in biological condition
cells, and/or

assaying a second expression level of at least one gene from a second gene
group, wherein the second gene group is selected from genes expressed in
normal tissue cells in an amount lower than expression in biological condition
cells,

correlating the first expression level to a standard expression level for normal
tissue, and/or the second expression level to a standard expression level for
biological condition cells to determine the presence or absence of a biological
condition in the animal tissue.
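The correlating step of claim 1 can be sketched as a comparison of each measured level with its standard. The decision rule below (a fold threshold applied in both directions) is an assumption made for illustration; the claim itself does not prescribe how the correlation is computed, and the name `assess_sample` is hypothetical.

```python
def assess_sample(first_level=None, normal_standard=None,
                  second_level=None, condition_standard=None,
                  min_fold=2.0):
    """Return True if the available evidence suggests the condition is present.

    first_level / normal_standard: a first-group gene and its standard level
    in normal tissue. second_level / condition_standard: a second-group gene
    and its standard level in condition cells. Either pair may be omitted,
    mirroring the "and/or" wording of the claim.
    """
    votes = []
    if first_level is not None and normal_standard is not None:
        # First-group genes are high in normal tissue, so a marked drop
        # relative to the normal standard points towards the condition.
        votes.append(first_level * min_fold <= normal_standard)
    if second_level is not None and condition_standard is not None:
        # Second-group genes are high in condition cells; a level within
        # min_fold of the condition standard is treated as condition-like.
        votes.append(second_level * min_fold >= condition_standard)
    if not votes:
        raise ValueError("at least one expression level and its standard are required")
    return all(votes)
```

Requiring all available comparisons to agree is a conservative choice; a real assay would need validated thresholds for each gene.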

2. The method of claim 1, wherein the animal tissue is selected from epithelial tissue.

3. The method of claim 2, wherein the animal tissue is selected from epithelial tissue
in the gastro-intestinal tract.

4. The method of claim 3, wherein the animal tissue is selected from epithelial tissue
in colon and/or rectum.

5. The method according to claim 4, wherein the animal tissue is mucosa.

6. The method of any of the preceding claims, wherein the biological condition is
an adenocarcinoma, a carcinoma, a teratoma, a sarcoma, and/or a lymphoma.

7. The method of any of the preceding claims, wherein the sample is a biopsy of
the tissue.

8. The method according to any of the preceding claims 1-6, wherein the sample is
a cell suspension made from the tissue.

9. The method according to any of the preceding claims, wherein the sample
comprises substantially only cells from said tissue.

10. The method according to claim 9, wherein the sample comprises substantially
only cells from mucosa.

11. The method according to any of the claims 3-10, wherein the gene from the first
gene group is selected individually from genes comprising a sequence as identified
below

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at BAC clone AC016778
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7
RC_T68873_f_at
RC_T40995_f_at
RC_H81070_f_at
RC_N30796_at
RC_W37778_f_at
RC_R70212_s_at
RC_AA426330_at
RC_N33927_s_at
RC_T90190_s_at
RC_AA447145_at
RC_H75860_at
RC_T71132_s_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).
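The identifiers in these lists share a visible pattern: an optional `RC_` prefix, an accession number, optional short qualifiers such as `s` or `f`, and a final `_at`. The sketch below splits an identifier along that pattern. The pattern is inferred from the entries listed in the claims rather than from a stated specification, and the helper name `parse_probe_id` is hypothetical.

```python
import re

# Optional "RC_" prefix, accession (1-2 letters + 5-6 digits),
# optional qualifiers (e.g. "_s", "_f", "_rna1"), then "_at".
PROBE_RE = re.compile(r"^(RC_)?([A-Z]{1,2}\d{5,6})((?:_[a-z0-9]+)*)_at$")


def parse_probe_id(probe_id):
    """Split an identifier such as 'RC_T47089_s_at' into its parts."""
    match = PROBE_RE.match(probe_id.strip())
    if match is None:
        raise ValueError("unrecognised identifier: %r" % probe_id)
    qualifiers = [q for q in match.group(3).split("_") if q]
    return {
        "accession": match.group(2),
        "rc_prefix": match.group(1) is not None,
        "qualifiers": qualifiers,
    }
```

A check of this kind is mainly useful for catching transcription errors, for instance an identifier whose underscores were lost or replaced by other characters.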



12. The method according to claim 11, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at BAC clone AC016778
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7

wherein the notation refers to Accession No. in the database UniGene (Build 18).



13. The method according to claim 12, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).



14. The method according to claim 13, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).

15. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA443793_at chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at
RC_T92363_s_at
RC_N89910_at
RC_W60516_at
RC_AA219699_at
RC_AA449450_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



16. The method according to any of claims 3-15, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA443793_at chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).



17. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).

18. The method according to any of claims 3-17, wherein the second gene group
comprises a sequence as identified below

RC_W80763_at hypothetical protein; chrom 17

wherein the notation refers to Accession No. in the database UniGene (Build 18).

19. The method according to any of the preceding claims, wherein the expression
levels of at least two genes from the first gene group are determined.

20. The method according to any of the preceding claims, wherein the expression
levels of at least two genes from the second gene group are determined.

21. The method according to any of the preceding claims, further comprising the
steps of determining the stage of a biological condition in the animal tissue,
comprising assaying a third expression level of at least one gene from a third
gene group, wherein a gene from said second gene group, in one stage, is
expressed differently from a gene from said third gene group.

22. The method according to any of the preceding claims, wherein the difference in
expression level of a gene from one group to the expression level of a gene from
another group is at least two-fold.

23. The method according to any of the preceding claims, wherein the difference in
expression level of a gene from one group to the expression level of a gene from
another group is at least three-fold.
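The two-fold and three-fold criteria of claims 22 and 23 amount to a ratio test between two expression levels. A minimal sketch follows, assuming strictly positive levels in a common unit; the function names are illustrative.

```python
def fold_difference(level_a, level_b):
    """Return the fold difference between two positive expression levels.

    The result is always >= 1, regardless of which level is larger.
    """
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    high, low = max(level_a, level_b), min(level_a, level_b)
    return high / low


def meets_threshold(level_a, level_b, min_fold=2.0):
    """True if the two levels differ by at least min_fold."""
    return fold_difference(level_a, level_b) >= min_fold
```

With `min_fold=2.0` the test corresponds to claim 22, and with `min_fold=3.0` to claim 23.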
24. The method according to any of the preceding claims, wherein the expression
level is determined by determining the mRNA of the cells.

25. The method according to any of the claims 1-23, wherein the expression level is
determined by determining expression products, such as peptides, in the cells.

26. The method according to claim 25, wherein the expression level is determined
by determining expression products, such as peptides, in the body fluids, such
as blood, serum, plasma, faeces, mucus, sputum, cerebrospinal fluid, and/or
urine.

27. A method of determining the stage of a biological condition in animal tissue,
comprising collecting a sample comprising cells from the tissue,

assaying the expression of at least a first stage gene from a first stage gene
group and/or at least a second stage gene from a second stage gene group,
wherein at least one of said genes is expressed in said first stage of the
condition in a higher amount than in said second stage, and the other gene is
expressed in said first stage of the condition in a lower amount than in said second
stage of the condition,

correlating the expression level of the assessed genes to a standard level of
expression, thereby determining the stage of the condition.
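One way to picture the correlating step of claim 27 is to compare the sample with a standard expression profile for each stage and pick the closest one. The sketch below uses Euclidean distance on log2-transformed levels; that metric, the data layout and the name `nearest_stage` are assumptions for illustration, since the claim does not specify how the correlation is carried out.

```python
import math


def nearest_stage(sample, standards):
    """Return the stage whose standard profile is closest to the sample.

    sample:    {gene: level} with positive expression levels.
    standards: {stage: {gene: level}} with positive standard levels.
    Only genes present in both the sample and a standard are compared.
    """
    def distance(profile):
        shared = sample.keys() & profile.keys()
        if not shared:
            raise ValueError("no genes shared between sample and standard")
        # Log-transform so that fold changes up and down weigh equally.
        return math.sqrt(sum(
            (math.log2(sample[g]) - math.log2(profile[g])) ** 2
            for g in shared
        ))

    return min(standards, key=lambda stage: distance(standards[stage]))
```

Working in log space means a two-fold increase and a two-fold decrease contribute the same distance, which matches the fold-based wording used elsewhere in the claims.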

28. The method according to claim 27, wherein the tissue is selected from the
epithelial tissue in colon or rectum.

29. The method according to any of the preceding claims 27-28, wherein the
difference in expression levels between a gene from one group to a gene from
another group is at least one-fold.

30. The method according to any of the preceding claims 27-29, wherein the
difference in expression levels between a gene from one group to a gene from
another group is at least two-fold.

31. The method according to claim 27, wherein the stage is selected from colon
cancer stages Dukes A, Dukes B, Dukes C, and Dukes D.

32. The method according to claim 31, comprising assaying the expression of at
least one Dukes A stage gene from a Dukes A stage gene group, at least one Dukes B
stage gene from a Dukes B stage gene group, at least one Dukes C stage gene
from a Dukes C stage gene group, and at least one Dukes D stage gene from a
Dukes D stage gene group, wherein at least one gene from each gene group is
expressed in a significantly different amount in that stage than in one of the
other stages.

33. The method according to claim 32, wherein at least one gene from each gene
group is expressed in a significantly higher amount in that stage than in one of
the other stages.

34. The method according to claim 33, wherein a Dukes A stage gene is selected
individually from any gene comprising a sequence as identified below

RC_AA599199_at ALU sag.
RC_R12694_at unnamed protein product BAA91641, chrom 10
RC_H91325_s_at aldolase B; aldolase B (aa 1-364); chrom 9
RC_N51709_at - chrom X
RC_N72610_at
RC_N69263_at chrom 10; AK026414 clone (only 108 nt hom)
RC_T15817_f_at iNOS, inducible nitric oxide synthase

wherein the notation refers to Accession No. in the database UniGene (Build 18).

35. The method according to claim 33, wherein a Dukes B stage gene is selected
individually from any gene comprising a sequence as identified below

RC_T67463_s_at cathepsin O2; X; K
RC_W94688_at perilipin
RC_AA126743_at Z97200 PAC chrom 1q24;






PMX1 homeobox gene
RC_AA236547_at no homology
RC_AA255567_at angiopoietin-related protein-2; angiopoietin-like 2
RC_AA421256_at
RC_AA386386_s_at PPPPP -
RC_AA452549_at PPPPP PRO1659; hypothetical protein chrom 11

wherein the notation refers to Accession No. in the database UniGene (Build 18).



36. The method according to claim 33, wherein a Dukes C stage gene is selected
individually from any gene comprising a sequence as identified below

RC_D45556_at
RC_W86214_at
RC_AA039439_s_at
RC_AA128935_at
RC_AA134158_s_at
RC_AA232646_at
RC_AA401184_at
RC_AA436840_at
RC_AA488655_at
RC_AA181902_at

chrom 15; AL390085 clone
novel gene KIAA0134 protein 19q13.3
class I homeodomain; homeobox protein, chrom 7
chrom 17, AF266756 sphingosine kinase (SPHK1)
no homology
PPPPP AC007201 on chrom 19 (only 80 nt hom)

wherein the notation refers to Accession No. in the database UniGene (Build 18).

37. The method according to claim 33, wherein a Dukes D stage gene is selected
individually from any gene comprising a sequence as identified below

RC_N91920_at AAAAP chrom 16p12-p11.2; XM_007994 retinoblastoma binding protein
RC_AA621601_at AAAAP chrom 17 XM_009868 RAB36 RAS oncogene family

wherein the notation refers to Accession No. in the database UniGene (Build 18).

38. The method according to claim 32, wherein at least one gene from each gene
group is expressed in a significantly lower amount in that stage than in one of
the other stages.

39. The method according to claim 38, wherein a Dukes A stage gene is selected
individually from any gene comprising a sequence as identified below

RC_N32411_f_at PAPPP Myc-associated zinc-finger protein of human islet; chrom 16

KIAA0882 protein
ras-like protein; ras-related C3 botulinum toxin substrate; dJ20J23
beta-4

RC_AA243858_at PAPPP
RC_AA486283_at PAPPP
RC_AA490930_at PAPPP
RC_H54088_s_at PPPPP
RC_H59052_f_at PPPPP
RC_R49198_s_at PPPPP
RC_T73572_f_at PPPPP
RC_AA477483_at PPPPP

wherein the notation refers to Accession No. in the database UniGene (Build 18).

40. The method according to claim 38, wherein a Dukes B stage gene is selected
individually from any gene comprising a sequence as identified below

RC_D59847_at PPAPP proSAAS; granin-like neuroendocrine peptide precursor
RC_F05038_at PPAPP polyamine modulated factor-1; polyamine modulated factor 1
RC_N41059_at PPAPP chrom 3
RC_T23460_at PPAPP chrom 3; IFNAR2 21q22.11
RC_W42789_at PPAPP chrom 8 AF268037 C8ORF4 protein (C8ORF4) chrom 8 ORF
RC_AA460017_f_at PPAPP BAC clone chrom 16
RC_AA482127_at PPAPP KIAA1142 protein
RC_AA504806_at PPAPP chrom 2 AF052107 clone 23620 mRNA sequence
RC_T90037_at PPPPP unnamed protein product, chrom 4
RC_AA432130_at PPPPP KIAA0867 protein, chrom 12

wherein the notation refers to Accession No. in the database UniGene (Build 18).
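The five-letter codes that accompany the identifiers (PAPPP, PPAPP, PPPAP, PPPPA) can be read as one call per sample type. The sketch below assumes that P means present, A means absent, and that the five positions correspond to normal mucosa followed by Dukes A, B, C and D; this reading is inferred from where the single A falls in each stage list and is not stated in the claims. The helper names are hypothetical.

```python
# Assumed order of the five positions in a profile string.
SAMPLES = ("normal", "Dukes A", "Dukes B", "Dukes C", "Dukes D")


def decode_profile(profile):
    """Map a profile such as 'PPAPP' to {sample: True if called present}."""
    profile = profile.strip().upper()
    if len(profile) != len(SAMPLES) or set(profile) - {"P", "A"}:
        raise ValueError("expected five letters, each P or A: %r" % profile)
    return {sample: call == "P" for sample, call in zip(SAMPLES, profile)}


def lost_only_in(profile):
    """Return the single sample type with an absent call, or None."""
    calls = decode_profile(profile)
    absent = [sample for sample, present in calls.items() if not present]
    return absent[0] if len(absent) == 1 else None
```

Under this reading, `lost_only_in("PPAPP")` returns `"Dukes B"`, which agrees with the stage named in claim 40 for genes carrying that code.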

41. The method according to claim 38, wherein a Dukes C stage gene is selected
individually from any gene comprising a sequence as identified below

RC_N30231_at PPPAP Lsm4 protein; U6 snRNA-associated Sm-like protein LSm4; glycine-rich protein
RC_W73790_f_at PPPAP immunoglobulin-related protein 14.1; lambda L-chain C region; omega protein, chrom 22
RC_AA412184_at PPPAP chrom 1p36; d89060 dolichyl-diphosphooligosaccharide-protein glycosyltransferase
RC_AA521303_at PPPAP methionine adenosyltransferase regulatory beta subunit; dTDP-4-keto-6-deoxy-D-glucose 4-reductase, chrom 5
RC_AA461174_at PPPPP 8p21.3-p22 AB020860 anti-oncogene
AA393432_s_at PPPPP chrom 2, Unknown; unnamed protein product AAD20029

wherein the notation refers to Accession No. in the database UniGene (Build 18).
42. The method according to claim 38, wherein a Dukes D stage gene is selected
individually from any gene comprising a sequence as identified below

RC_R72886_s_at PPPPA KIAA0422; adenylyl cyclase type VI, chrom 12
RC_AA026030_at PPPPA chrom 1
_Zo9uuo__at PPPPA hypothetical protein, chrom 17
_AA43o9UO_at PPPPA chrom 19; ac011491 clone and 20 nt hom. RAB2, RAS oncogene family
_AA057o29_S_ai PPPPA growth-arrest-specific protein; growth arrest-specific 6; AXL stimulatory factor, chrom 13
RC_R72087_at PPPPA chrom 5 EST; hom to chrom 20 AL356652 clone
RC_H04242_at PPPPA ras related protein Rab5b; RAB5B, member RAS oncogene family
RC_R97304_f_at PPPPA HLA-drb5; cell surface glycoprotein; MHC HLA-DR-beta chain precursor chrom 6
RC_N48609_at PPPPA chrom 11; AC004584 chrom 17
RC_W86850_f_at PPPPA chrom 22 ? X96924 mitochondrial citrate transport region
RC_AA130603_at PPPPA ak024908 clone
RC_AA479610_at PPPPA singleton ak025344 clone
RC_AA490593_i_at PPPPA chrom 17 ? Synaptobrevin2 (VAMP2) AF135372
RC_AA054321_s_at PPPPA 6p21 HLA class I region; AC004202 clone
RC_D60328_at PPPPP chrom 6, unknown; ring finger protein 5
RC_H96850_at PPPPP oligosaccharyltransferase d89060 1p36.1 (also C-class)
RC_AA127444_at PPPPP chrom 1 no homology
RC_AA242824_at PPPPP chrom 11; ac005233 PAC clone chrom 22
AA405775_s_at PPPPP similar to CAA16821 (PID:g3255952)

wherein the notation refers to Accession No. in the database UniGene (Build 18).
43. A method of determining an expression pattern of a colon cell sample,
comprising:

collecting a sample comprising colon and/or rectum cells and/or expression
products from colon and/or rectum cells,

determining the expression level of two or more genes in the sample, wherein at
least one gene belongs to a first group of genes, said gene from the first gene
group being expressed in a higher amount in normal tissue than in biological
condition cells, and wherein at least one other gene belongs to a second group
of genes, said gene from the second gene group being expressed in a lower
amount in normal tissue than in biological condition cells, and the difference
between the expression level of the first gene group in normal cells and
biological condition cells being at least two-fold, obtaining an expression pattern of
the colon and/or rectum cell sample.

44. The method of claim 43, wherein the two or more genes exclude genes which
are expressed in the submucosal, muscle, or connective tissue, whereby a
pattern of expression is formed for the sample which is independent of the
proportion of submucosal, muscle, or connective tissue cells in the sample.

45. The method of claim 44, comprising determining the expression level of one or
more genes in the sample comprising predominantly submucosal, muscle, and
connective tissue cells, obtaining a second pattern, subtracting said second
pattern from the expression pattern of the colon and/or rectum cell sample,
forming a third pattern of expression, said third pattern of expression reflecting
expression of the colorectal mucosa or colorectal cancer cells independent of
the proportion of submucosal, muscle, and connective tissue cells present in the
sample.
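The exclusion step of claim 44 and the subtraction step of claim 45 can be sketched with expression patterns held as gene-to-level mappings. The representation, the function names and the unscaled, zero-clamped subtraction are assumptions for illustration; the claims do not state how the second pattern is scaled before it is subtracted.

```python
def exclude_genes(pattern, excluded):
    """Drop genes known to be expressed in submucosal, muscle or connective tissue."""
    return {gene: level for gene, level in pattern.items() if gene not in excluded}


def subtract_pattern(sample_pattern, background_pattern):
    """Subtract a background (second) pattern from the sample pattern.

    Genes missing from the background are left unchanged, and results are
    clamped at zero so that no gene ends up with a negative level.
    """
    return {
        gene: max(level - background_pattern.get(gene, 0.0), 0.0)
        for gene, level in sample_pattern.items()
    }
```

For instance, subtracting `{"g2": 25.0, "g3": 40.0}` from `{"g1": 120.0, "g2": 30.0, "g3": 10.0}` yields `{"g1": 120.0, "g2": 5.0, "g3": 0.0}`, the third pattern in the terms of claim 45.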

46. The method of any of the preceding claims 43-45, wherein the sample is a
biopsy of the tissue.

47. The method according to any of the preceding claims 43-46, wherein the sample
is a cell suspension.

48. The method according to any of the preceding claims 43-47, wherein the sample
comprises substantially only cells from said tissue.

49. The method according to claim 48, wherein the sample comprises substantially
only cells from mucosa.

50. The method according to any of the claims 43-47, wherein the gene from the
first gene group is selected individually from

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at BAC clone AC016778
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7
RC_T68873_f_at
RC_T40995_f_at
RC_H81070_f_at
RC_N30796_at
RC_W37778_f_at
RC_R70212_s_at
RC_AA426330_at
RC_N33927_s_at
RC_T90190_s_at
RC_AA447145_at
RC_H75860_at
RC_T71132_s_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



51. The method according to claim 50, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at BAC clone AC016778
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7

wherein the notation refers to Accession No. in the database UniGene (Build 18).



52. The method according to claim 51, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at chrom 15 no homology
RC_Z39652_at Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
RC_R01646_at chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).

53. The method according to claim 52, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_T47089_s_at tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at chrom 2 no homology
AA319615_at secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).



54. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA443793_at chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at
RC_T92363_s_at
RC_N89910_at
RC_W60516_at
RC_AA219699_at
RC_AA449450_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



55. The method according to any of claims 43-49, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA443793_at chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).



56. The method according to any of claims 43-49, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_AA075642_at gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at chrom 13 no homology
RC_N33920_at ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at KIAA1199 protein, chrom 15
RC_R67275_s_at alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at hypothetical protein; chrom 17
RC_AA034499_s_at ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at no homology
RC_AA417078_at chrom 7q31; AF017104 clone
M29873_s_at cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).

57. The method according to any of claims 43-49, wherein the second gene group
comprises a sequence as identified below

RC_W80763_at        hypothetical protein; chrom 17

wherein the notation refers to Accession No. in the database UniGene (Build
18).

58. The method according to any of the preceding claims 43-57, wherein the
expression level of at least two genes from the first gene group are determined.

59. The method according to any of the preceding claims 43-58, wherein the
expression level of at least two genes from the second gene group are determined.

60. A method of determining an expression pattern of a colon cell sample
independent of the proportion of submucosal, muscle, or connective tissue cells
present, comprising:

determining the expression of one or more genes in a sample comprising cells,
wherein the one or more genes exclude genes which are expressed in the
submucosal, muscle, or connective tissue, whereby a pattern of expression is
formed for the sample which is independent of the proportion of submucosal,
muscle, or connective tissue cells in the sample.



61. The method according to claim 60, comprising determining the expression level
of one or more genes in the sample comprising predominantly submucosal,
muscle, and connective tissue cells, obtaining a second pattern, subtracting said
second pattern from the expression pattern of the colon and/or rectum cell
sample, forming a third pattern of expression, said third pattern of expression
reflecting expression of the colon cells independent of the proportion of
submucosal, muscle, and connective tissue cells present in the sample.
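
The subtraction step recited in claim 61 can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the claimed procedure: it assumes each expression pattern is a mapping from a probe identifier to a numeric expression level, treats a probe missing from the second pattern as contributing zero, and uses hypothetical probe names (`probe_A`, `probe_B`) and made-up values.

```python
def subtract_pattern(sample_pattern, second_pattern):
    """Form a 'third pattern' by subtracting the second pattern probe by probe.

    Assumption: a probe absent from second_pattern is treated as level 0.0.
    """
    return {
        probe: level - second_pattern.get(probe, 0.0)
        for probe, level in sample_pattern.items()
    }


# Hypothetical values for illustration only.
sample_pattern = {"probe_A": 900.0, "probe_B": 400.0}
second_pattern = {"probe_B": 350.0}  # e.g. submucosal/muscle/connective tissue signal

third_pattern = subtract_pattern(sample_pattern, second_pattern)
```

How the second pattern would be scaled before subtraction is not stated in the claim, so the sketch subtracts it unscaled.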

62. A method of determining the presence or absence of a biological condition in
human colon and/or rectum tissue comprising,

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-61,

correlating the determined expression pattern to a standard pattern,

determining the presence or absence of the biological condition of said tissue.

63. A method for determining the stage of a biological condition in animal tissue,
comprising

collecting a sample comprising cells from the tissue,

determining an expression pattern of the cells as defined in any of claims 43-61,

correlating the determined expression pattern to a standard pattern,

determining the stage of the biological condition in said tissue.
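
The "correlating ... to a standard pattern" step of claims 62 and 63 can be sketched as follows. The claims do not specify a correlation measure, so the use of the Pearson coefficient here is an assumption, as are the function names, the standard labels and all numeric values. Patterns are assumed to be equal-length lists of expression levels measured over the same genes in the same order.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def closest_standard(pattern, standards):
    """Return the label of the standard pattern that correlates best."""
    return max(standards, key=lambda label: pearson(pattern, standards[label]))


# Hypothetical standard patterns and measurement, for illustration only.
standards = {
    "normal": [10.0, 8.0, 1.0, 2.0],
    "stage_example": [2.0, 1.0, 9.0, 12.0],
}
measured = [3.0, 2.0, 8.0, 10.0]

label = closest_standard(measured, standards)
```

With one standard pattern per stage, the returned label would stand for the stage whose standard the sample resembles most under this assumed measure.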

64. A method for reducing cell tumorigenicity of a cell, said method comprising

contacting a tumor cell with at least one peptide expressed by at least one gene
selected from genes being expressed in an amount at least two-fold higher in
normal cells than the amount expressed in said tumor cell.
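
The two-fold criterion of claim 64 can be illustrated with a short Python sketch. It assumes expression levels are held in mappings from probe identifier to a positive number; the function name, the `min_fold` parameter, the probe names and the values are hypothetical. Probes with no measurable tumor level are skipped because the ratio is undefined, which is a simplification rather than anything the claim prescribes.

```python
def higher_in_normal(normal, tumor, min_fold=2.0):
    """Return probes whose normal level is at least min_fold times the tumor level."""
    selected = []
    for probe, normal_level in normal.items():
        tumor_level = tumor.get(probe)
        # Simplification: skip probes with a missing or non-positive tumor
        # level, since the fold ratio cannot be computed for them.
        if tumor_level is None or tumor_level <= 0:
            continue
        if normal_level / tumor_level >= min_fold:
            selected.append(probe)
    return sorted(selected)


# Hypothetical values for illustration only.
normal = {"probe_A": 800.0, "probe_B": 300.0, "probe_C": 100.0}
tumor = {"probe_A": 200.0, "probe_B": 250.0, "probe_C": 400.0}

candidates = higher_in_normal(normal, tumor)
```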

65. The method according to claim 64, wherein the at least one gene is selected
individually from genes comprising a sequence as identified below

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build
18).



66. The method according to claim 64 or 65, wherein the tumor cell is contacted with
at least two different peptides.



67. A method for reducing cell tumorigenicity of a cell, said method comprising

obtaining at least one gene selected from genes being expressed in an amount at
least two-fold higher in normal cells than the amount expressed in said tumor
cell,

introducing said at least one gene into the tumor cell in a manner allowing
expression of said gene(s).

68. The method according to claim 67, where the at least one gene is selected
individually from genes comprising a sequence as identified below

RC_H04768_at        chrom 15 no homology
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at      chrom 2 no homology
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build
18).

69. The method according to claim 67 or 68, wherein at least two different genes are
introduced into the tumor cell.

70. A method for reducing cell tumorigenicity of a cell, said method comprising

obtaining at least one nucleotide probe capable of hybridising with at least one gene
of a tumor cell, said at least one gene being selected from genes being expressed in
an amount at least one-fold lower in normal cells than the amount expressed in said
tumor cell, and

introducing said at least one nucleotide probe into the tumor cell in a manner
allowing the probe to hybridise to the at least one gene, thereby inhibiting
expression of said at least one gene.

71. The method according to claim 70, wherein the nucleotide probe is selected
from probes capable of hybridising to a nucleotide sequence comprising a sequence
as identified below

RC_AA609013_s_at    APPPP    microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at      APPPP    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at      APPPP    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at      APPPP    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at      APPPP    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at      APPPP    chrom 13 no homology
RC_N33920_at        APPPP    ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at        APPPP    KIAA1199 protein, chrom 15
RC_R67275_s_at      APPPP    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at        APPPP    hypothetical protein; chrom 17
RC_AA443793_at      APPPP    chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at    APPPP    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at      APPPP    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at      APPPP    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at        APPPP    chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at      APPPP    no homology
RC_AA417078_at      APPPP    chrom 7q31; AF017104 clone
M29873_s_at         APPPP    cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at      AAPPP
RC_T92363_s_at      AAPPP
RC_N89910_at        AAAPP
RC_W60516_at        AAAPP
RC_AA219699_at      AAAPP
RC_AA449450_at      AAAPP

wherein the notation refers to Accession No. in the database UniGene (Build
18).



72. The method according to claim 70 or 71, wherein at least two different genes are
introduced into the tumor cell.



73. A method for producing antibodies against an expression product of a cell from a
biological tissue, said method comprising the steps of

obtaining expression product(s) from at least one gene said gene being expressed
as defined in any of claims 27-37,

immunising a mammal with said expression product(s) obtaining antibodies against
the expression product.

74. A pharmaceutical composition for the treatment of a biological condition 
comprising at least one antibody produced as described in claim 73. 

75. A vaccine for the prophylaxis or treatment of a biological condition comprising at
least one expression product from at least one gene said gene being expressed as
defined in any of claims 27-37.

76. The use of a method as defined in any of claims 1-63 for producing an assay for
diagnosing a biological condition in animal tissue.

77. The use of a peptide as defined in any of claims 64-66 for preparation of a 
pharmaceutical composition for the treatment of a biological condition in animal 
tissue. 

78. The use of a gene as defined in any of claims 67-69 for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal
tissue.

79. The use of a probe as defined in any of claims 70-72 for preparation of a
pharmaceutical composition for the treatment of a biological condition in animal
tissue.

80. An assay for determining the presence or absence of a biological condition in
animal tissue, comprising

at least one first marker capable of detecting a first expression level of at least
one gene from a first gene group, wherein the gene from the first gene group is
selected from genes expressed in normal tissue cells in an amount higher than
expression in biological condition cells,

at least one second marker capable of detecting a second expression level of at
least one gene from a second gene group, wherein the second gene group is
selected from genes expressed in normal tissue cells in an amount lower than
expression in biological condition cells.
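
A minimal sketch of how readings from the two markers of claim 80 might be combined is given below. The decision rule is an assumption made for illustration: the condition is called "present" when the first-group level falls below its standard and the second-group level rises above its standard, "absent" when neither shift is seen, and "inconclusive" otherwise. The function name, parameters and values are hypothetical.

```python
def read_assay(first_level, second_level, first_standard, second_standard):
    """Combine two marker readings against their standard levels.

    first_*  : gene from the group expressed higher in normal tissue
    second_* : gene from the group expressed lower in normal tissue
    """
    first_reduced = first_level < first_standard
    second_raised = second_level > second_standard
    if first_reduced and second_raised:
        return "present"
    if not first_reduced and not second_raised:
        return "absent"
    return "inconclusive"


# Hypothetical readings and standards, for illustration only.
result = read_assay(first_level=50.0, second_level=900.0,
                    first_standard=200.0, second_standard=300.0)
```

Using one marker from each group means a call rests on two shifts in opposite directions rather than on a single reading.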

81. The assay according to claim 80, wherein the marker is a nucleotide probe.

82. The assay according to claim 80, wherein the marker is an antibody. 

83. The assay according to claim 80, wherein the genes are as defined in any of
claims 11-18, 34-37, and 39-42.

84. An assay for determining an expression pattern of a colon and/or rectum cell,
comprising at least a first marker and a second marker, wherein the first marker is
capable of detecting a gene from a first gene group as defined in claim 43, and the
second marker is capable of detecting a gene from a second gene group as defined
in claim 43.

85. The assay according to claim 84, wherein the first marker is capable of detecting
one gene as identified in Table I, and the second marker is capable of detecting
another gene as identified in Table I.

86. The assay according to claim 85, comprising at least two markers for each gene
group.

correlating the first expression level and the second expression level to a standard 
level of the assessed genes to determine the presence or absence of a biological 
condition in the animal tissue. 

87. The assay according to claim 86, wherein the marker is a nucleotide probe.

88. The assay according to claim 86, wherein the marker is an antibody.

89. A method for identifying a tissue sample as colo-rectal, comprising subjecting
the tissue to a method as identified in any of claims 43-61, determining expression
patterns and comparing the expression patterns determined with expression
patterns from colo-rectal tissue.